Plants comprising wheat G-type cytoplasmic male sterility restorer genes, molecular markers and uses thereof

ABSTRACT

Methods are described for selecting or producing a cereal plant comprising a functional restorer gene for wheat G-type cytoplasmic male sterility and nucleic acids for use therein.

FIELD OF THE INVENTION

The present invention relates generally to the field of plant breeding and molecular biology and concerns a method for selecting or producing a cereal plant comprising a functional restorer gene for wheat G-type cytoplasmic male sterility, and nucleic acids for use therein.

BACKGROUND

Cytoplasmic male sterility (CMS) is a major trait of interest in cereals such as wheat in the context of commercial hybrid seed production (Kihara, 1951; Wilson and Ross, 1962; Lucken, 1987; Sage, 1976). The cytoplasms of Triticum timopheevii (G-type) and Aegilops kotschyi (K-type) are widely studied as inducers of male sterility in common wheat (Triticum aestivum), due to few deleterious effects (Kaul, 1988; Lucken, 1987; Mukai and Tsunewaki, 1979).

In hybrid seed production systems using the G-type cytoplasm, fertility restoration is a critical problem. Most hexaploid wheats do not naturally contain fertility restoration genes (Ahmed et al. Genes Genet. Syst. 2001). In the complicated restoration system of T. timopheevii, eight Rf genes are reported to restore fertility against T. timopheevii cytoplasm, and their chromosome locations have been determined, namely, Rf1 (1A), Rf2 (7D), Rf3 (1B), Rf4 (6B), Rf5 (6D), Rf6 (5D), Rf7 (7B) and Rf8 (Tahir & Tsunewaki, 1969; Yen et al., 1969; Bahl & Maan, 1973; Du et al., 1991; Sihna et al., 2013). Ma et al. (1991) transferred an Rf gene from Aegilops umbellulata to wheat, the gene being located on chromosomes 6AS and 6BS (from Zhou et al., 2005).

Ma and Sorrels (Crop Science 1995) reported the linkage of Rf3 to RFLP markers Xbcd249 and Xcdo442 on chromosome 1BS.

Kojima (Genes Genet Syst 1997) localized a fertility restorer gene from Chinese Spring, termed Rf3, at a position 1.2 cM and 2.6 cM distant from RFLP markers Xcdo388 and Xabc156, respectively, although the authors were able to separate Rf3 from Xcdo388. It was estimated that Rf3 could exist within a region of 500 Kbp of the adjacent RFLP markers.

Ahmed Talaat et al (Genes Genet. Syst., 2001) determined the close linkage of a major Rf QTL against G-type cytoplasm on chromosome 1B with RFLP marker XksuG9c, close to marker Xabc156 as reported by Kojima et al (supra).

Zhang et al., (Yi Chuan Xue Bao 2003) describe an Rf gene located on 1BS with a genetic distance of 5.1 cM to microsatellite marker Xgwm550.

Zhou et al (2005) describe the Rf3 gene as located either between SSR markers Xgwm582 and Xbarc207 or between Xbarc207 and Xgwm131, but very close to Xbarc207. Since the previously identified RFLP markers of Kojima, Ahmed and Ma & Sorrels were not mapped in their mapping population, a linkage map including these RFLP markers could not be constructed to better estimate the distance between Rf3 and the identified SSR markers.

Accordingly, there remains a need for more accurate markers to identify and track Rf loci in breeding, which are particularly useful for hybrid seed production, and for improved methods for fertility restoration in wheat with T. timopheevii cytoplasm. The present invention provides a contribution over the art by disclosing the functional Rf gene on chromosome 1B and by providing markers that are more tightly linked to the causal gene.

FIGURE LEGENDS

FIG. 1: Seed set on the main head (ss_mh), as observed in two different locations (g, m). Number of plants (y-axis) per class of amount of seed (x-axis).

FIG. 2: Profile plot for significance of marker-trait associations along chromosome 1B in -log10(p). Indicative threshold = 3.9.

FIG. 3: (A) Predicted gene structure for the identified PPR gene. @ indicates CDS, # 5′ UTR, and * 3′ UTR. (B) Amino acid sequence of the identified PPR gene indicating the transit peptide (italic) and the PPR motifs (alternatingly underlined and not underlined) including the 5th and 35th amino acid implicated in RNA recognition (bold). (C) Graphical representation of the structure of the PPR protein with transit peptide and PPR motifs.

FIG. 4: (A) Overall alignment of the putative RNA recognition motif of the identified PPR protein with ORF256. (B) Close-up showing nucleotide alignment.

FIG. 5: Mean normalized expression levels of Rf3-PPR in tissues of Rf3 restorer and wild-type (non-restorer) F4 progeny of a cross between ‘Resource-5’ and a CMS line. Rf3-containing progeny were identified following KASP genotyping with fine-mapping markers and phenotyped to confirm restoration of fertility.

DETAILED DESCRIPTION

The present invention describes the identification of a functional restorer (Rf) locus and gene for wheat G-type cytoplasmic male sterility (i.e., T. timopheevii cytoplasm) located on chromosome 1B (short arm 1BS), as well as markers associated therewith. Said markers can be used in marker-assisted selection (MAS) of cereal plants, such as wheat, comprising said functional restorer genes located on chromosome 1B. The identification of the genes and markers is therefore extremely useful in methods for hybrid seed production, as they can be used e.g. in a method for restoring fertility in progeny of a plant possessing G-type cytoplasmic male sterility, thereby producing fertile progeny plants from a G-type cytoplasmic male sterile parent plant. Likewise, the present disclosure also allows identifying plants lacking the desired allele, so that non-restorer plants can be identified and, e.g., eliminated from subsequent crosses.

One advantage of marker-assisted selection over field evaluations for fertility restoration is that MAS can be done at any time of year regardless of the growing season. Moreover, environmental effects are irrelevant to marker-assisted selection.

When a population is segregating for multiple loci affecting one or multiple traits, e.g., multiple loci involved in fertility restoration or multiple loci each involved in fertility restoration of different cytoplasmic male sterility (CMS) systems or loci affecting distinct traits (for example fertility and disease resistance), the efficiency of MAS compared to phenotypic screening becomes even greater because all the loci can be processed in the lab together from a single sample of DNA. Any one or more of the markers and/or marker alleles, e.g., two or more, up to and including all of the established markers, can be assayed simultaneously.

Another use of MAS in plant breeding is to assist the recovery of the recurrent parent genotype by backcross breeding. Backcross breeding is the process of crossing a progeny back to one of its parents. Backcrossing is usually done for the purpose of introgressing one or a few loci from a donor parent into an otherwise desirable genetic background from the recurrent parent. The more cycles of backcrossing that are done, the greater the genetic contribution of the recurrent parent to the resulting variety. This is often necessary, because donor parent plants may be otherwise undesirable, e.g., due to low yield, low fecundity or the like. In contrast, varieties which are the result of intensive breeding programs may have excellent yield, fecundity or the like, merely being deficient in one desired trait such as fertility restoration. As a skilled worker understands, backcrossing can be done to select for or against a trait. For example, in the present invention, one can select for a restorer gene for breeding a restorer line, or one can select against a restorer gene for breeding a maintainer (female pool).
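By way of non-limiting illustration, the relationship between the number of backcross cycles and the genetic contribution of the recurrent parent can be sketched as follows. The sketch uses the textbook expectation 1 - (1/2)^(n+1), which assumes no selection and ignores linkage drag around the introgressed locus; this formula and the function name are illustrative assumptions and are not taken from the present disclosure.

```python
def expected_recurrent_parent_fraction(n_backcrosses: int) -> float:
    """Expected recurrent-parent genome fraction after n backcrosses.

    n_backcrosses = 0 corresponds to the F1 (donor x recurrent parent).
    Textbook expectation without selection: 1 - (1/2) ** (n + 1).
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be >= 0")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


if __name__ == "__main__":
    for n in range(5):
        label = "F1" if n == 0 else f"BC{n}"
        print(f"{label}: {expected_recurrent_parent_fraction(n):.4f}")
```

In practice, marker-assisted background selection may recover the recurrent parent genotype faster than this expectation suggests.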

The presently described Rf locus on chromosome 1B was mapped to a segment along chromosome 1B, in an interval of about 15.8 cM, said interval being flanked by markers of SEQ ID NO 2 and SEQ ID NO 8.

Thus, in a first aspect, a method is provided for selecting a cereal plant comprising a functional restorer gene allele for wheat G-type cytoplasmic male sterility or for producing a cereal plant comprising a functional restorer gene allele for wheat G-type cytoplasmic male sterility, comprising the steps of:

-   (a) Identifying at least one cereal plant comprising at least one marker allele linked to a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B; and
-   (b) Selecting the plant comprising said at least one marker allele, wherein said plant comprises said functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B,

wherein said at least one marker allele localises within an interval on chromosome 1B comprising and flanked by the markers of SEQ ID NO 2 and SEQ ID NO 8.

In a second aspect, a method is provided for restoring fertility in a progeny of a G-type cytoplasmic male sterile cereal plant or for producing a fertile progeny plant from a G-type cytoplasmic male sterile cereal parent plant, comprising the steps of:

-   (a) Providing a population of progeny plants obtained from crossing a female cereal parent plant with a male cereal parent plant, wherein the female parent plant is a G-type cytoplasmic male sterile cereal plant, and wherein the male parent plant comprises a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B;
-   (b) Identifying in said population a fertile progeny plant comprising at least one marker allele linked to said functional restorer gene allele for wheat G-type cytoplasmic male sterility, wherein said progeny plant comprises said functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B; and optionally
-   (c) Selecting said fertile progeny plant; and optionally
-   (d) Propagating the fertile progeny plant,

wherein said at least one marker allele localises within an interval on chromosome 1B comprising and flanked by the markers of SEQ ID NO 2 and SEQ ID NO 8.

Male sterility in connection with the present invention refers to the failure or partial failure of plants to produce functional pollen or male gametes. This can be due to natural or artificially introduced genetic predispositions or to human intervention on the plant in the field. Male fertile on the other hand relates to plants capable of producing normal functional pollen and male gametes. Male sterility/fertility can be reflected in seed set upon selfing, e.g. by bagging heads to induce self-fertilization. Likewise, fertility restoration can also be described in terms of seed set upon crossing a male sterile plant with a plant carrying a functional restorer gene, when compared to seed set resulting from crossing (or selfing) fully fertile plants.

A male parent or pollen parent is a parent plant that provides the male gametes (pollen) for fertilization, while a female parent or seed parent is the plant that provides the female gametes for fertilization, said female plant being the one bearing the seeds.

Cytoplasmic male sterility or “CMS” refers to cytoplasmic-based and maternally-inherited male sterility. CMS is total or partial male sterility in plants as the result of specific nuclear and mitochondrial interactions and is maternally inherited via the cytoplasm. Male sterility is the failure of plants to produce functional anthers, pollen, or male gametes, although CMS plants still produce viable female gametes. Cytoplasmic male sterility is used in agriculture to facilitate the production of hybrid seed.

“Wheat G-type cytoplasmic male sterility”, as used herein, refers to the cytoplasm of Triticum timopheevii that can confer male sterility when introduced into common wheat (i.e. Triticum aestivum), thereby resulting in a plant carrying common wheat nuclear genes but cytoplasm from Triticum timopheevii that is male sterile. The cytoplasm of Triticum timopheevii (G-type) as an inducer of male sterility in common wheat has been extensively studied (Wilson and Ross, Genes Genet. Syst. 1962; Kaul, Male sterility in higher plants. Springer Verlag, Berlin. 1988; Lucken, Hybrid wheat. In Wheat and wheat improvement. Edited by E. G. Heyne. American Society of Agronomy, Madison, Wis., 1987; Mukai and Tsunewaki, Theor. Appl. Genet. 54, 1979; Tsunewaki, Jpn. Soc. Prom. Sci. 1980; Tsunewaki et al., Genes Genet. Syst. 71, 1996). The origin of the CMS phenotype conferred by T. timopheevii cytoplasm is associated with a novel chimeric gene termed orf256, which is upstream of cox1 sequences and is cotranscribed with an apparently normal cox1 gene. Antisera prepared against polypeptide sequences predicted from orf256 recognized a 7-kDa protein present in the CMS line but not in the parental or restored lines (Song and Hedgcoth, Genome 37(2), 1994; Hedgcoth et al., Curr. Genet. 41, 357-365, 2002).

As used herein “a functional restorer gene allele for wheat G-type cytoplasmic male sterility” or “a functional restorer locus for wheat G-type cytoplasmic male sterility” or a “restorer QTL for wheat G-type cytoplasmic male sterility” indicates an allele that has the capacity to restore fertility in the progeny of a cross with a G-type cytoplasmic male sterility (“CMS”) line, i.e., a line carrying common wheat nuclear genes but cytoplasm from Triticum timopheevii. Restoration against G-type cytoplasm has e.g. been described by Robertson and Curtis (Crop Sci. 9, 1967), Yen et al. (Can. J. Genet. Cytol. 11, 1969), Bahl and Maan (Crop Sci. 13, 1973), Talaat et al. (Egypt. J. Genet. 2, 195-205, 1973), Zhang et al. (2003, supra), Ma and Sorrels (1995, supra), Kojima (1997, supra), Ahmed Talaat et al (2001, supra), Zhou et al (2005, supra). Such restorer genes or alleles are also referred to as Rf genes and Rf alleles.

The term “maintainer” refers to a plant that when crossed with the CMS plant does not restore fertility, and maintains sterility in the progeny. The maintainer is used to propagate the CMS line, and may also be referred to as a non-restorer line. Maintainer lines have the same nuclear genes as the sterile one (i.e. do not contain functional Rf genes), but differ in the composition of cytoplasmic factors that cause male sterility in plants, i.e. maintainers have “fertile” cytoplasm. Therefore, when a male sterile line is crossed with its maintainer, progeny with the same male sterile genotype will be obtained.

The term “cereal” relates to members of the monocotyledonous family Poaceae which are cultivated for the edible components of their grain. These grains are composed of endosperm, germ and bran. Maize, wheat and rice together account for more than 80% of the worldwide grain production. Other members of the cereal family comprise rye, oats, barley, triticale, sorghum, wild rice, spelt, einkorn, emmer, durum wheat and kamut.

In one embodiment, a cereal plant according to the invention is a cereal plant that comprises at least a B genome or related genome, such as wheat (Triticum aestivum; ABD), spelt (Triticum spelta; ABD), durum (T. turgidum; AB), barley (Hordeum vulgare; H) and rye (Secale cereale; R). In a specific embodiment, the cereal plant according to the invention is wheat (Triticum aestivum; ABD).

A “molecular marker” or “marker” or “marker nucleic acid” or “genetic marker”, as used herein, refers to a polymorphic locus, i.e. a polymorphic nucleotide (a so-called single nucleotide polymorphism or SNP) or a polymorphic DNA sequence at a specific locus. A marker refers to a measurable, genetic characteristic with a fixed position in the genome, which is normally inherited in a Mendelian fashion, and which can be used for mapping of a trait of interest or to identify certain individuals with a certain trait of interest. A marker thus refers to a gene or nucleotide sequence that can be used to identify plants having a particular allele, e.g., the presently described Rf alleles on chromosome 1B. A marker may be described as a variation at a given genomic locus. It may be a short DNA sequence, such as a sequence surrounding a single base-pair change (single nucleotide polymorphism, or “SNP”), or a long one, for example, a microsatellite/simple sequence repeat (“SSR”). A molecular marker may also include ‘Indels’, which refers to the insertion or the deletion of bases, or a combination of both, in the DNA of an organism.

The term “marker genotype” refers to the combination of marker alleles present at a polymorphic locus on each chromosome of the chromosome pair. The term “marker allele” refers to the version of the marker that is present in a particular plant at one of the chromosomes. Typically, a marker can exist as or can be said to have or to comprise two marker alleles. The term “haplotype”, as used herein, refers to a specific combination of marker alleles as present within a certain plant or group of (related) plants. See also the below definitions of a SNP (marker) genotype and SNP (marker) allele.

A “marker context” or “marker context sequence”, as used herein, refers to 50-150 bp upstream of a marker, such as a SNP marker, and/or 50-150 bp downstream of such a marker. The marker context of the herein described (SNP) markers is given in the sequence listing, flanking the SNP position. The upstream and downstream sequences of a (SNP) marker can also be referred to as (upstream and/or downstream) flanking sequences.

Identifying a cereal plant comprising at least one marker allele linked to a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B can be accomplished using a molecular marker assay that detects the presence of at least one such marker allele, e.g. the marker alleles described herein that are linked to the functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B. This can involve obtaining or providing a biological sample, i.e. plant material, or providing genomic DNA of a plant, and analyzing the genomic DNA of the material for the presence of at least one of said marker alleles (or the marker genotype for at least one of such markers). In this method also other molecular marker tests described elsewhere herein can be used.

As will be well known to a person skilled in the art, markers and marker assays include for example Restriction Fragment Length Polymorphisms (RFLPs), Random Amplified Polymorphic DNAs (RAPDs), Amplified Fragment Length Polymorphisms (AFLPs), DAF, Sequence Characterized Amplified Regions (SCARs), microsatellite or Simple Sequence Repeat markers (SSRs), single-nucleotide polymorphisms (SNPs), and KBioscience Competitive Allele-Specific PCR (KASPar), as inter alia described in Jonah et al. (Global Journal of Science Frontier Research 11:5, 2011) and Lateef (Journal of Biosciences and Medicines, 2015, 3, 7-18).

As used herein, the term “single nucleotide polymorphism” (SNP) may refer to a DNA sequence variation occurring when a single nucleotide in the genome (or other shared sequence) differs between members of a species or paired chromosomes in an individual. Within a population, SNPs can be assigned a minor allele frequency, i.e. the lowest allele frequency at a locus that is observed in a particular population. This is simply the lesser of the two allele frequencies for single-nucleotide polymorphisms. There are variations between various populations, so a SNP allele that is common in one geographical group or variety may be much rarer in another.
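Purely as an illustrative sketch, the minor allele frequency of a biallelic SNP, as defined above, could be computed from observed allele calls as follows. The function name and the example calls are hypothetical and are not data from the present disclosure.

```python
from collections import Counter
from typing import Iterable


def minor_allele_frequency(alleles: Iterable[str]) -> float:
    """Return the lesser of the two allele frequencies at a biallelic SNP.

    `alleles` holds one base call per sampled chromosome, e.g. "T" or "C".
    A monomorphic sample (only one allele observed) yields 0.0.
    """
    counts = Counter(a.upper() for a in alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no allele calls supplied")
    if len(counts) > 2:
        raise ValueError("more than two alleles observed; SNP is not biallelic")
    if len(counts) == 1:
        return 0.0
    return min(counts.values()) / total


if __name__ == "__main__":
    # Hypothetical calls from eight sampled chromosomes.
    print(minor_allele_frequency(["T", "T", "T", "C", "T", "T", "C", "T"]))
```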

Single nucleotide polymorphisms may fall within coding sequences of genes, non-coding regions of genes, or in the intergenic regions between genes. SNPs within a coding sequence will not necessarily change the amino acid sequence of the protein that is produced, due to degeneracy of the genetic code. A SNP in which both forms lead to the same polypeptide sequence is termed “synonymous” (sometimes referred to as a silent mutation). If a different polypeptide sequence is produced, it is termed “non-synonymous.” A non-synonymous change may either be mis-sense or nonsense, where a mis-sense change results in a different amino acid and a nonsense change results in a premature stop codon. SNPs that are not in protein-coding regions may still have consequences for e.g. gene splicing, transcription factor binding, or the sequence of non-coding RNA (e.g. affecting transcript stability, translation). SNPs are usually biallelic and thus easily assayed in plants and animals.
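As a minimal sketch of the distinction drawn above, a SNP within a coding sequence can be classified by translating the reference codon and the variant codon with the standard genetic code. The function name is hypothetical, the example codons are generic and not taken from the sequence listing, and a reference codon that is itself a stop codon is outside the scope of this sketch.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_coding_snp(ref_codon: str, alt_codon: str) -> str:
    """Classify a coding SNP as synonymous, missense or nonsense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == "*":
        raise ValueError("reference codon is a stop codon; not covered here")
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


if __name__ == "__main__":
    print(classify_coding_snp("GAA", "GAG"))  # Glu -> Glu
    print(classify_coding_snp("GAA", "GAT"))  # Glu -> Asp
    print(classify_coding_snp("TGG", "TGA"))  # Trp -> stop
```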

A particularly useful assay for detection of SNP markers is for example KBioscience Competitive Allele-Specific PCR (KASP, see www.kpbioscience.co.uk). For developing the KASP assay, 70 base pairs upstream and 70 base pairs downstream of the SNP are selected, and two allele-specific forward primers and one allele specific reverse primer are designed. See e.g. Allen et al. 2011, Plant Biotechnology J. 9, 1086-1099, especially p 1097-1098 for KASP assay method.
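The selection of 70 base pairs on either side of the SNP can be sketched as below. This is an illustration only: the function name, the 0-based indexing convention and the toy sequence are assumptions, and the sketch covers extraction of the flanking context, not primer design itself.

```python
from typing import Tuple


def kasp_design_window(sequence: str, snp_index: int, flank: int = 70) -> Tuple[str, str, str]:
    """Return (upstream, snp_base, downstream) around a SNP.

    `snp_index` is the 0-based position of the SNP in `sequence`.
    Raises ValueError when fewer than `flank` bases are available on a side.
    """
    if snp_index - flank < 0 or snp_index + flank >= len(sequence):
        raise ValueError("sequence too short for the requested flanks")
    upstream = sequence[snp_index - flank:snp_index]
    downstream = sequence[snp_index + 1:snp_index + 1 + flank]
    return upstream, sequence[snp_index], downstream


if __name__ == "__main__":
    # Toy sequence (hypothetical): 70 A's, the SNP base T, then 70 C's.
    toy = "A" * 70 + "T" + "C" * 70
    up, snp, down = kasp_design_window(toy, 70)
    print(len(up), snp, len(down))
```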

The terms “linked to” or “linkage”, as used herein, refer to a measurable probability that genes or markers located on a given chromosome are being passed on together to individuals in the next generation. Thus, the term “linked” may refer to one or more genes or markers that are passed together with a gene with a probability greater than 0.5 (the value expected from independent assortment where markers/genes are located on different chromosomes). Because the proximity of two genes or markers on a chromosome is directly related to the probability that the genes or markers will be passed together to individuals in the next generation, the term “linked” may also refer herein to one or more genes or markers that are located within about 50 centimorgan (cM) or less of one another on the same chromosome. Genetic linkage is usually expressed in terms of cM. Centimorgan is a unit of recombinant frequency for measuring genetic linkage, defined as that distance between genes or markers for which one product of meiosis in 100 is recombinant, or in other words, the centimorgan is equal to a 1% chance that a marker at one genetic locus on a chromosome will be separated from a marker at a second locus due to crossing over in a single generation. It is often used to infer distance along a chromosome. The number of base pairs to which a cM corresponds varies widely across the genome (different regions of a chromosome have different propensities towards crossover) and the species (i.e. the total size of the genome).
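The centimorgan definition given above (one recombinant product of meiosis in 100 corresponds to 1 cM) can be illustrated with the following sketch. It converts an observed recombinant fraction directly into cM, which is only a reasonable approximation over short distances; mapping functions that correct for multiple crossovers are not part of the present disclosure and are not applied here. The function names and example counts are hypothetical.

```python
def map_distance_cm(recombinants: int, total_meioses: int) -> float:
    """Approximate genetic distance in cM from observed recombinant counts."""
    if total_meioses <= 0:
        raise ValueError("total_meioses must be positive")
    if not 0 <= recombinants <= total_meioses:
        raise ValueError("recombinants must lie between 0 and total_meioses")
    return 100.0 * recombinants / total_meioses


def is_linked(distance_cm: float, threshold_cm: float = 50.0) -> bool:
    """Loci closer than the threshold are treated as linked.

    50% recombination is what independent assortment produces, so a
    distance at or above the threshold gives no evidence of linkage.
    """
    return distance_cm < threshold_cm


if __name__ == "__main__":
    d = map_distance_cm(recombinants=1, total_meioses=100)
    print(d, is_linked(d))
```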

The presently described Rf locus on chromosome 1B was mapped to a segment of chromosome 1B, in an interval of about 15.8 cM, said interval being flanked by markers of SEQ ID NO 2 and SEQ ID NO 8. These and any marker located in between can be said to comprise an allele that is linked to the functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B. Thus, in this respect, the term linked can be a separation of about 15.8 cM, or less, such as about 12.5 cM, about 10 cM, about 7.5 cM, about 6 cM, about 5 cM, about 4 cM, about 3 cM, about 2.5 cM, about 2 cM, or even less. Particular examples of markers comprising an allele linked to the functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B are specified in table 1. The peak marker was the marker of SEQ ID NO. 6.

Further fine mapping narrowed the 1B region to an interval of about 1.25 cM (from 6.8 to 8.05 cM), comprising the markers as represented by SEQ ID NO. 11, SEQ ID NO. 12, SEQ ID NO. 13 and SEQ ID NO. 14. These and any further marker located in said interval can be said to comprise an allele that is “tightly linked” to the functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B. Thus, the term “tightly linked” as used herein can be a separation of about 1.25 cM, or even less, such as about 1.0 cM, about 0.95 cM, about 0.9 cM, about 0.85 cM, about 0.8 cM, about 0.75 cM, about 0.5 cM, about 0.4 cM, about 0.3 cM, about 0.25 cM, about 0.20 cM, about 0.15 cM, about 0.10 cM, or even less. Particular examples of markers comprising an allele tightly linked to the functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B are given in table 2. The marker closest to the peak was SEQ ID NO. 13.

Thus, said at least one marker allele linked to said functional restorer gene allele located on chromosome 1B can be selected from any one of:

-   a. a T at SEQ ID NO: 2;
-   b. a C at SEQ ID NO: 3;
-   c. a T at SEQ ID NO: 4;
-   d. a T at SEQ ID NO: 5;
-   e. an A at SEQ ID NO: 6;
-   f. an A at SEQ ID NO: 7;
-   g. a G at SEQ ID NO: 8;
-   h. a C at SEQ ID NO: 11;
-   i. an A at SEQ ID NO: 12;
-   j. a T at SEQ ID NO: 13;
-   k. a T at SEQ ID NO: 14,

or any combination thereof.

As used herein, “a T at SEQ ID NO: 2” or “a C at SEQ ID NO. 3” and the like, refers to a T or a C etc. being present at a position corresponding to the position of the SNP in said SEQ ID NO, as e.g. indicated in table 1 or 2. This can for example be determined by alignment of the genomic sequence with said SEQ ID NO. Thus, “a T at SEQ ID NO: 2” means “a T at a position corresponding to position 51 of SEQ ID NO: 2”, etc.
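By way of illustration only, the marker alleles listed above can be held in a lookup table and compared with the base(s) observed in a plant at the corresponding SNP position. The allele assignments are those stated in the present description; the function names, the toy sequence and the alternative (non-restorer) bases used in the example are hypothetical, and alignment of the genomic sequence to the SEQ ID NO is assumed to have been done beforehand.

```python
from typing import Iterable

# Marker alleles linked to the functional restorer gene allele,
# keyed by SEQ ID NO (as listed in the description).
RESTORER_LINKED_ALLELES = {
    2: "T", 3: "C", 4: "T", 5: "T", 6: "A", 7: "A", 8: "G",
    11: "C", 12: "A", 13: "T", 14: "T",
}


def base_at(aligned_sequence: str, position_1based: int) -> str:
    """Return the base at a 1-based position of an already aligned sequence."""
    if not 1 <= position_1based <= len(aligned_sequence):
        raise ValueError("position outside the sequence")
    return aligned_sequence[position_1based - 1].upper()


def carries_linked_allele(seq_id_no: int, observed_bases: Iterable[str]) -> bool:
    """True if any observed base at the SNP equals the restorer-linked allele."""
    linked = RESTORER_LINKED_ALLELES[seq_id_no]
    return linked in {b.upper() for b in observed_bases}


if __name__ == "__main__":
    # Toy 101-bp context with the SNP at position 51 (hypothetical sequence).
    toy = "A" * 50 + "T" + "G" * 50
    print(carries_linked_allele(2, [base_at(toy, 51)]))
    # A heterozygous call at the marker of SEQ ID NO: 13 (hypothetical bases).
    print(carries_linked_allele(13, ["T", "C"]))
```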

In a further embodiment, said at least one marker allele localises to an interval from 6.8 to 8.05 cM on chromosome 1B. Said 1.25 cM interval comprises the markers of SEQ ID NO. 11, SEQ ID NO. 12, SEQ ID NO. 13 and SEQ ID NO. 14 at the positions as indicated in table 2.

For example, said at least one marker allele linked to said functional restorer gene allele can be selected from any one of:

-   a. a C at SEQ ID NO: 11;
-   b. an A at SEQ ID NO: 12;
-   c. a T at SEQ ID NO: 13;
-   d. a T at SEQ ID NO: 14,

or any combination thereof.

In an even further embodiment, said at least one marker allele linked to said functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B localises to an interval of 0.95 cM (from 7.1 to 8.05 cM) on chromosome 1B flanked by and comprising the marker pair of SEQ ID NO. 11 and SEQ ID NO. 14.
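As a small illustrative sketch, the widths of the fine-mapped intervals stated above, and a test of whether a given map position falls within them, can be expressed as follows. The interval end points are those given in the description; the example map position of 7.0 cM is hypothetical, and map positions are expected to vary between populations.

```python
from typing import Tuple

Interval = Tuple[float, float]

# Interval end points in cM on chromosome 1B, as stated in the description.
FINE_MAP_INTERVAL_CM: Interval = (6.8, 8.05)
NARROW_INTERVAL_CM: Interval = (7.1, 8.05)


def interval_width(interval: Interval) -> float:
    """Width of an interval in cM, rounded to two decimals."""
    start, end = interval
    return round(end - start, 2)


def in_interval(position_cm: float, interval: Interval) -> bool:
    """True if a map position lies within the interval, end points included."""
    start, end = interval
    return start <= position_cm <= end


if __name__ == "__main__":
    print(interval_width(FINE_MAP_INTERVAL_CM))
    print(interval_width(NARROW_INTERVAL_CM))
    # Hypothetical marker position of 7.0 cM.
    print(in_interval(7.0, FINE_MAP_INTERVAL_CM), in_interval(7.0, NARROW_INTERVAL_CM))
```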

In a particular embodiment, said at least one marker allele linked to said functional restorer gene allele is a T at SEQ ID NO. 13.

The term “interval” refers to a continuous linear span of chromosomal DNA with termini defined by map position and/or markers. For example, the interval comprising and flanked by the marker pair of SEQ ID NO: 11 and SEQ ID NO: 14 comprises the specifically mentioned flanking markers and the markers located in between, e.g. SEQ ID NO: 12 and 13 as listed in table 2 below. The interval comprising and flanked by the marker pair of SEQ ID NO: 2 and SEQ ID NO: 8 comprises the markers of SEQ ID NO: 3 to 7 as well as the markers of SEQ ID NO: 11-14. Accordingly, a flanking marker, as used herein, is a marker that defines one of the termini of an interval (and is included in that interval). It will be clear that any of such intervals may comprise further markers not specifically mentioned herein.

The position of the chromosomal segments identified, and the markers thereof, when expressed as recombination frequencies or map units, are provided herein as a matter of general information. The embodiments described herein were obtained using particular wheat populations. Accordingly, the positions of particular segments and markers as map units are expressed with reference to the used populations. It is expected that numbers given for particular segments and markers as map units may vary from cultivar to cultivar and are not part of the essential definition of the DNA segments and markers, which DNA segments and markers are otherwise described, for example, by nucleotide sequence.

A locus (plural loci), as used herein, refers to a certain place or position on the genome, e.g. on a chromosome or chromosome arm, where for example a gene or genetic marker is found. A QTL (quantitative trait locus), as used herein, refers to a position on the genome that corresponds to a measurable characteristic, i.e. a trait, such as the presently described Rf loci.

As used herein, the term “allele(s)”, such as of a gene, means any of one or more alternative forms of a gene at a particular locus. In a diploid cell of an organism, alleles of a given gene are located at a specific location or locus (loci plural) on a chromosome. One allele is present on each chromosome of the pair of homologous chromosomes or possibly on homeologous chromosomes.

As used herein, the term “homologous chromosomes” means chromosomes that contain information for the same biological features and contain the same genes at the same loci but possibly different alleles of those genes. Homologous chromosomes are chromosomes that pair during meiosis. “Non-homologous chromosomes”, representing all the biological features of an organism, form a set, and the number of sets in a cell is called ploidy. Diploid organisms contain two sets of non-homologous chromosomes, wherein each homologous chromosome is inherited from a different parent. In tetraploid species, two sets of diploid genomes exist, whereby the chromosomes of the two genomes are referred to as “homeologous chromosomes” (and similarly, the loci or genes of the two genomes are referred to as homeologous loci or genes). Likewise, hexaploid species have three sets of diploid genomes, etc. A diploid, tetraploid or hexaploid plant species may comprise a large number of different alleles at a particular locus. The ploidy levels of domesticated wheat species range from diploid (Triticum monococcum, 2n=14, AA), tetraploid (T. turgidum, 2n=28, AABB) to hexaploid (T. aestivum, 2n=42, AABBDD).

As used herein, the term “heterozygous” means a genetic condition existing when two different alleles reside at a specific locus, but are positioned individually on corresponding pairs of homologous chromosomes in the cell. Conversely, as used herein, the term “homozygous” means a genetic condition existing when two identical alleles reside at a specific locus, but are positioned individually on corresponding pairs of homologous chromosomes in the cell.

An allele of a particular gene or locus can have a particular penetrance, i.e. it can be dominant, partially dominant, co-dominant, partially recessive or recessive. A dominant allele is a variant of a particular locus or gene that when present in heterozygous form in an organism results in the same phenotype as when present in homozygous form. A recessive allele on the other hand is a variant that in heterozygous form is overruled by the dominant allele, thus resulting in the phenotype conferred by the dominant allele, and that only in homozygous form leads to the recessive phenotype. Partially dominant, co-dominant or partially recessive refers to the situation where the heterozygote displays a phenotype that is an intermediate between the phenotype of an organism homozygous for the one allele and an organism homozygous for the other allele of a particular locus or gene. This intermediate phenotype is a demonstration of partial or incomplete dominance or penetrance. When partial dominance occurs, a range of phenotypes is usually observed among the offspring. The same applies to partially recessive alleles.

Cytoplasmic male sterility is caused by one or more mutations in the mitochondrial genome (termed “sterile cytoplasm”) and is inherited as a dominant, maternally transmitted trait. For cytoplasmic male sterility to be used in hybrid seed production, the seed parent must contain a sterile cytoplasm and the pollen parent must contain (nuclear) restorer genes (Rf genes) to restore the fertility of the hybrid plants grown from the hybrid seed. Accordingly, such Rf genes are preferably at least partially dominant, most preferably dominant, in order to have sufficient restoring ability in offspring.

Markers within a chromosomal interval flanked by the above-mentioned markers are, for example, the markers listed in Tables 1-2 below between the specifically mentioned markers, or other markers that are not explicitly shown but which are also flanked by the marker pairs mentioned. The skilled person can easily identify new markers in the genomic region or subgenomic region flanked by any of the marker pairs listed above. Such markers need not be SNP markers, but can be any type of genotypic or phenotypic marker mapped to that genomic or subgenomic region. Preferably such markers are genetically and physically linked to the presently described Rf loci as present in (and as derivable from) at least Accession number PI 583676 (USDA National Small Grains Collection), but preferably also as present in other cereals comprising the Rf 1B locus. In other words, the markers are preferably indicative of the presence of the Rf locus in a non-source specific manner.

In a further embodiment, at least two, three, four, or more marker alleles linked to said functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B can be used, such as at least two, three, four, or more marker nucleic acids selected from any one of SEQ ID NO. 2, SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 5, SEQ ID NO. 6, SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NO. 11, SEQ ID NO. 12, SEQ ID NO. 13, SEQ ID NO. 14.

In a further embodiment, at least two, three, four, or more contiguous marker alleles linked to said functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B may be used. A contiguous marker, as used herein, is a nucleotide sequence located “upstream” or “downstream” of another marker, depending on whether the contiguous nucleotide sequence from the chromosome is on the 5′ or the 3′ side of the original marker, as conventionally understood, e.g. in the order as listed in Table 1 or 2.

Integration of the fine map with partial genome sequences identified a scaffold, represented by SEQ ID NO. 15, as harboring the functional restorer gene allele. Thus, in any of the herein described embodiments or aspects, the functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B may localize to the genome scaffold as represented by SEQ ID NO. 15.

A “contig”, as used herein, refers to a set of overlapping DNA segments that together represent a consensus region of DNA. In bottom-up sequencing projects, a contig refers to overlapping sequence data (reads); in top-down sequencing projects, contig refers to the overlapping clones that form a physical map of the genome that is used to guide sequencing and assembly. Contigs can thus refer both to overlapping DNA sequence and to overlapping physical segments (fragments) contained in clones, depending on the context.

A “scaffold”, as used herein, refers to overlapping DNA contigs that together represent a consensus region of DNA.

In a further embodiment, said functional restorer gene allele is a functional allele of a gene encoding a pentatricopeptide repeat (PPR) protein (i.e. a PPR gene) localising within any of the above intervals or to said scaffold.

PPR proteins are classified based on their domain architecture. P-class PPR proteins possess the canonical 35 amino acid motif and normally lack additional domains. Members of this class have functions in most aspects of organelle gene expression. PLS-class PPR proteins have three different types of PPR motifs, which vary in length: P (35 amino acids), L (long, 35-36 amino acids) and S (short, ~31 amino acids), and members of this class are thought to mainly function in RNA editing. Subtypes of the PLS class are categorized based on the additional C-terminal domains they possess (reviewed by Manna et al., 2015, Biochimie 113, p 93-99, incorporated herein by reference).

Most fertility restoration (Rf) genes come from a small clade of genes encoding pentatricopeptide repeat (PPR) proteins (Fuji et al., 2011, PNAS 108(4), 1723-1728, herein incorporated by reference). PPR genes functioning as fertility restoration (Rf) genes are referred to in Fuji (supra) as Rf-PPR genes. Rf-PPR genes are usually present in clusters of similar Rf-PPR-like genes, which show a number of characteristic features compared with other PPR genes. They consist primarily of tandem arrays of 15-20 PPR motifs, each composed of 35 amino acids.

Most Rf PPR genes belong to the P-class PPR subfamily, although PLS-class PPR Rf genes have also been identified, and are characterized by the presence of tandem arrays of 15 to 20 PPR motifs, each composed of 35 amino acid residues. High substitution rates observed for particular amino acids within otherwise very conserved PPR motifs, indicating diversifying selection, prompted the conclusion that these residues might be directly involved in binding to RNA targets. This has led to the development of a “PPR code” which allows the prediction of RNA targets of naturally occurring PPR proteins as well as the design of synthetic PPR proteins that can bind RNA molecules of interest, whereby sequence specificity is ensured by distinct patterns of hydrogen bonding between each RNA base and the amino acid side chains at positions 5 and 35 in the aligned PPR motif (Melonek et al., 2016, Nat Sci Report 6:35152; Barkan et al., 2012, PLoS Genet 8(8): e1002910, both incorporated herein by reference).
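By way of illustration only, the idea of a PPR code can be sketched as a lookup from the residue pair at positions 5 and 35 of each motif to a preferred RNA base. The pairings in the table below are a simplified assumption taken from the general PPR literature and are not disclosed in this document; the residue pairs in the example are hypothetical, and `PPR_CODE` and `predict_target` are names introduced here.

```python
# Simplified PPR code: (residue at position 5, residue at position 35) -> RNA base.
# ASSUMPTION: these pairings are a commonly cited simplification, not part of
# this specification. "Y" stands for a pyrimidine (C or U).
PPR_CODE = {
    ("T", "D"): "G",
    ("T", "N"): "A",
    ("N", "D"): "U",
    ("N", "N"): "Y",
    ("N", "S"): "C",
}


def predict_target(motif_pairs):
    """Return a predicted recognition string, one symbol per PPR motif.

    Pairs that are not in the table are reported as "N" (any base).
    """
    return "".join(PPR_CODE.get(pair, "N") for pair in motif_pairs)


# Hypothetical residue pairs for a protein with five PPR motifs.
example_pairs = [("T", "N"), ("N", "D"), ("N", "D"), ("T", "D"), ("S", "K")]
print(predict_target(example_pairs))  # AUUGN
```

A prediction of this kind contains ambiguity symbols wherever the residue pair is uninformative, which is why predicted recognition sequences are typically written with degenerate nucleotide codes.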

Accordingly, a functional allele of a PPR gene, as used herein, refers to an allele of a PPR gene that is a functional restorer gene allele for wheat G-type cytoplasmic male sterility as described herein, i.e. that when expressed in a (sexually compatible) cereal plant has the capacity to restore fertility in the progeny of a cross with a G-type cytoplasmic male sterile cereal plant. Such a functional allele of a PPR gene is also referred to as a PPR-Rf gene (or Rf-PPR gene), which in turn encodes a PPR-Rf (or Rf-PPR) protein.

In one embodiment, said functional restorer gene allele encodes a polypeptide, such as a PPR protein, that has the capacity to (specifically) bind to the CMS ORF256 (SEQ ID NO. 23). Bind to or specifically bind to or (specifically) recognize, as used herein, means that according to the above described PPR code, the PPR protein contains a number of PPR motifs with specific residues at positions 5 and 35 and which are ordered in such a way so as to be able to bind to a target mRNA, in this case the ORF256 mRNA, in a sequence-specific or sequence-preferential manner.

For example, the functional restorer gene allele can encode a PPR protein containing PPR motifs with specific residues at the above indicated positions so as to recognize the target sequence AACTGTTTCTATTTGCAC of ORF256 (nt 129-146 of SEQ ID NO. 23). In one example, the predicted recognition sequence can be AUUUKCASNCNYACGU (SEQ ID NO. 22).
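The predicted recognition sequence is written with degenerate nucleotide symbols (K, S, N, Y). The following minimal sketch shows how such a pattern can be compared against an RNA sequence using the standard IUPAC ambiguity codes. It tests only the first eight symbols of the pattern against the 18-nucleotide target quoted above, because that fragment is too short to align the full 16-symbol pattern; it illustrates degenerate matching and makes no statement about the actual binding site. The function name `find_matches` is introduced here.

```python
# IUPAC nucleotide ambiguity codes, written for RNA (U instead of T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU", "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG", "N": "ACGU",
}


def find_matches(pattern, rna):
    """Return 0-based start positions at which the degenerate pattern matches rna."""
    hits = []
    for start in range(len(rna) - len(pattern) + 1):
        window = rna[start:start + len(pattern)]
        if all(base in IUPAC[code] for code, base in zip(pattern, window)):
            hits.append(start)
    return hits


# Target sequence from the text, converted from DNA to RNA notation.
target_rna = "AACTGTTTCTATTTGCAC".replace("T", "U")

# First eight symbols of the predicted recognition sequence.
print(find_matches("AUUUKCAS", target_rna))  # [10]
```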

In a further embodiment, said functional restorer gene allele is a functional allele of a PPR gene encoded by SEQ ID NO. 16, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 21, or a PPR gene encoding the polypeptide of SEQ ID NO. 17 or SEQ ID NO. 20. For example, said functional restorer gene allele can comprise or encode a sequence that is substantially identical to SEQ ID NO. 16, SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21 as defined herein, such as at least 85%, 85.5%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 99.5% identical to SEQ ID NO. 16, SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21.

In a further embodiment, said functional restorer gene allele is a functional restorer gene allele as present in (and as derivable from) at least Accession number PI 583676 (USDA National Small Grains Collection, also known as Dekalb 582M and registered as US PVP 7400045).

In an even further embodiment, said functional restorer gene allele comprises the nucleotide sequence of SEQ ID NO. 16, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 21 or encodes the polypeptide of SEQ ID NO. 17 or SEQ ID NO. 20.

It will be clear that when reference herein is made to a certain SNP genotype or SNP allele (or marker genotype or marker allele) in a specific genomic sequence (selected e.g. from SEQ ID NO: 1 to SEQ ID NO: 14, or fragments thereof), this also encompasses the SNP genotype or allele in variants of the genomic sequence, i.e. the SNP genotype or allele in a genomic sequence that is homologous to the sequence referred to, e.g. comprising at least 90%, 95%, 98%, 99% or more (substantial) sequence identity to that sequence (selected e.g. from SEQ ID NO: 1 to SEQ ID NO: 14, or fragments thereof). Thus any reference herein to any one of SEQ ID NO: 1 to 14 (or fragments thereof) in one aspect also encompasses a variant of any one of SEQ ID NO: 1 to 14 (or fragments thereof), said variant (homologous sequence) comprising at least 85%, 90%, 95%, 98%, 99% or more sequence identity to said sequence (using e.g. the program ‘Needle’), but comprising said SNP (marker) genotype or allele.
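As a rough illustration of how a percentage sequence identity over a global alignment can be obtained, the sketch below implements a basic Needleman-Wunsch alignment and reports the share of identical columns. It is not the ‘Needle’ program: the scoring values (match 1, mismatch -1, linear gap -1) are assumptions chosen for simplicity, whereas Needle uses a substitution matrix and affine gap penalties, so the figures may differ. The sequences in the usage line are hypothetical.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment with a linear gap penalty (assumed scores)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned columns as a percentage of the alignment length."""
    x, y = global_align(a, b)
    if not x:
        return 0.0
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * same / len(x)


# Hypothetical 10-nt sequences differing at one position.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # 90.0
```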

The SNP genotype refers to two nucleotides, and genomic sequences comprising one of these two nucleotides, one on each chromosome of the chromosome pair. So a plant having e.g. an AA genotype for SEQ ID NO. 6 has an identical nucleotide (A) on both chromosomes at the position corresponding to nucleotide 32 of SEQ ID NO: 6, while a plant having an AG genotype for SEQ ID NO. 6 has one chromosome with an A at the position corresponding to nucleotide 32 of SEQ ID NO: 6 and one chromosome with a G at said nucleotide position. Accordingly, a SNP allele refers to one of the two nucleotides of the SNP genotype as present on a chromosome.
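The AA versus AG example can be sketched as follows. The two input strings stand for the sequence around the SNP on each chromosome of the pair; they are hypothetical placeholders (only position 32 carries a real base here), and the function names are introduced for illustration.

```python
def snp_genotype(homolog_1, homolog_2, position=32):
    """Return the two-letter genotype at a 1-based position on both homologs."""
    idx = position - 1
    return "".join(sorted((homolog_1[idx], homolog_2[idx])))


def describe(genotype):
    """Classify a two-letter SNP genotype as homozygous or heterozygous."""
    return "homozygous" if genotype[0] == genotype[1] else "heterozygous"


# Hypothetical 40-nt placeholders; "N" marks positions not relevant here.
chrom_with_a = "N" * 31 + "A" + "N" * 8
chrom_with_g = "N" * 31 + "G" + "N" * 8

print(snp_genotype(chrom_with_a, chrom_with_a))            # AA
print(describe(snp_genotype(chrom_with_a, chrom_with_g)))  # heterozygous
```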

Based on the present disclosure, the skilled person can easily identify any further Rf specific marker or marker alleles as listed above. This can for example be done by sequencing genomic regions in-between any of the markers mentioned herein or by mapping new markers to a region in between any of the marker intervals or sub-intervals listed above. Preferably, but not necessarily, such markers are common markers, i.e. they are present on chromosome 1B of more than one Rf source.

The invention further describes a method for producing a cereal (e.g. wheat) plant comprising a functional restorer gene allele for wheat G-type cytoplasmic male sterility, comprising the steps of

-   a. crossing a first cereal plant comprising a functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B, with a second plant (wherein said first plant comprises at least one marker allele linked to a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B as described herein, and hence is identifiable using the methods described herein)
-   b. identifying (and optionally selecting) a progeny plant comprising a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B according to any of the methods described herein, by identifying a progeny plant comprising at least one marker allele linked to a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B as described herein (wherein said progeny plant comprises said functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B).

Also provided is a method for producing a cereal plant comprising a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B, comprising the steps of

-   a. crossing a first cereal plant homozygous for a functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B with a second cereal plant (wherein said first cereal plant comprises at least one marker allele linked to a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B as described herein, preferably wherein said plant is homozygous for said at least one marker allele)
-   b. obtaining a progeny plant, wherein said progeny plant comprises a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B (wherein said progeny plant comprises at least one marker allele linked to a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B as described herein, and hence is identifiable using the methods described herein).

Said second plant can be a plant not comprising a functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B.

In an even further embodiment, the invention provides a method for producing F1 hybrid seeds or F1 hybrid plants, comprising the steps of:

-   a. Providing a male cereal (e.g. wheat) parent plant comprising a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B;
-   b. Crossing said male parent plant with a female cereal (e.g. wheat) parent plant, wherein the female parent plant is a G-type cytoplasmic male sterile cereal plant;
-   c. Optionally collecting hybrid seeds from said cross.

The F1 hybrid seeds and plants preferably comprise at least one marker allele linked to a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B as described herein, and the F1 plants grown from the seeds are therefore fertile. Preferably, the male parent plant is thus homozygous for said functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B and hence is also homozygous for said at least one marker allele.
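The preference for a homozygous male parent can be illustrated with a simple single-locus Mendelian sketch. It assumes one nuclear restorer locus with a restorer allele written `R` and a non-restoring allele written `r`, and a female parent carrying no restorer allele; these symbols and function names are introduced here for illustration only.

```python
from collections import Counter
from itertools import product


def offspring_genotypes(male, female):
    """Count nuclear genotypes over all gamete combinations at one locus."""
    return Counter("".join(sorted(pair)) for pair in product(male, female))


def fraction_with_restorer(counts):
    """Share of offspring carrying at least one restorer allele (R)."""
    total = sum(counts.values())
    carriers = sum(n for genotype, n in counts.items() if "R" in genotype)
    return carriers / total


# Homozygous restorer male: every F1 plant carries the restorer allele.
print(fraction_with_restorer(offspring_genotypes("RR", "rr")))  # 1.0
# Heterozygous restorer male: only half of the F1 plants are expected to.
print(fraction_with_restorer(offspring_genotypes("Rr", "rr")))  # 0.5
```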

In the above method, the male parent plant used for crossing can be selected using any of the herein described methods for selecting a cereal plant comprising a functional restorer gene for wheat G-type cytoplasmic male sterility. Accordingly, the male parent plant comprises at least one marker allele linked to a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B, preferably in homozygous form.

The invention also provides cereal plants, such as wheat plants, obtained by any of the above methods, said cereal plant comprising at least one marker allele linked to the functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B.

Said at least one marker allele linked to the functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B may localize to the same chromosomal intervals or contigs and can be selected from the same groups as described above for the other embodiments and aspects.

Also described is a cereal plant, plant part, plant cell or seed comprising at least one functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B, said plant comprising at least one marker allele linked to a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B, wherein said at least one marker allele localises within an interval on chromosome 1B comprising and flanked by the markers of SEQ ID NO 2 and SEQ ID NO 8, preferably wherein said plant comprises at least one of (such as one, two, three, four, five, six, seven, eight, nine, ten or all of):

-   a. a T at SEQ ID NO: 2;
-   b. a C at SEQ ID NO: 3;
-   c. a T at SEQ ID NO: 4;
-   d. a T at SEQ ID NO: 5;
-   e. an A at SEQ ID NO: 6;
-   f. an A at SEQ ID NO: 7;
-   g. a G at SEQ ID NO: 8;
-   h. a C at SEQ ID NO: 11;
-   i. an A at SEQ ID NO: 12;
-   j. a T at SEQ ID NO: 13;
-   k. a T at SEQ ID NO: 14,

said plant not comprising any one or all of

-   l. an A at SEQ ID NO: 1;
-   m. a T at SEQ ID NO: 9.
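Purely as an illustration, the marker alleles of items a to m above can be held in two lookup tables and compared with the genotype calls of a plant. The input format (a mapping from SEQ ID NO to the set of alleles observed at that marker), the example calls and the names `LINKED`, `EXCLUDED` and `summarize` are assumptions introduced here. The sketch only reports which listed alleles are present and leaves the decision rule to the user.

```python
# Marker alleles taken from items a-k (linked) and l-m (excluded) above,
# keyed by SEQ ID NO.
LINKED = {2: "T", 3: "C", 4: "T", 5: "T", 6: "A", 7: "A",
          8: "G", 11: "C", 12: "A", 13: "T", 14: "T"}
EXCLUDED = {1: "A", 9: "T"}


def summarize(calls):
    """Return (linked markers present, excluded markers present) for one plant.

    calls maps a SEQ ID NO to the set of alleles observed at that marker.
    """
    linked_hits = sorted(k for k, allele in LINKED.items()
                         if allele in calls.get(k, set()))
    excluded_hits = sorted(k for k, allele in EXCLUDED.items()
                           if allele in calls.get(k, set()))
    return linked_hits, excluded_hits


# Hypothetical genotype calls for a single plant.
plant = {1: {"G"}, 2: {"T"}, 6: {"A", "G"}, 9: {"T"}}
print(summarize(plant))  # ([2, 6], [9])
```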

Also described is a cereal plant, plant part, plant cell or seed comprising at least one functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B, said plant comprising at least one marker allele linked to a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B, wherein said at least one marker allele localises within an interval on chromosome 1B comprising and flanked by the markers of SEQ ID NO 11 and SEQ ID NO 14, preferably wherein said plant comprises at least one of (such as one, two, three or all of):

-   a. a C at SEQ ID NO: 11;
-   b. an A at SEQ ID NO: 12;
-   c. a T at SEQ ID NO: 13;
-   d. a T at SEQ ID NO: 14,

said plant not comprising any one or all of

-   e. a T at SEQ ID NO: 2;
-   f. an A at SEQ ID NO: 8.

Also described is a cereal plant, plant part, plant cell or seed comprising at least one functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B, said plant comprising at least one marker allele linked to a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B, wherein said at least one marker allele localises within an interval on chromosome 1B comprising and flanked by the markers of SEQ ID NO 11 and SEQ ID NO 14, preferably wherein said plant comprises at least one of (such as one, two, three or all of):

-   a. a C at SEQ ID NO: 11;
-   b. an A at SEQ ID NO: 12;
-   c. a T at SEQ ID NO: 13;
-   d. a T at SEQ ID NO: 14,

said plant not comprising any one or all of

-   e. a T at SEQ ID NO: 10;
-   f. a T at SEQ ID NO: 5.

In a further embodiment, any of the above plants, plant parts, plant cells or seeds comprises a T at SEQ ID NO. 13. In a further embodiment, said plant comprising a T at SEQ ID NO. 13 does not comprise any one or all of: a C at SEQ ID NO: 11; an A at SEQ ID NO: 12; a T at SEQ ID NO: 14.

Also provided are plant parts, plant cells and seed from the cereal plants according to the invention comprising said at least one marker allele and said functional restorer gene allele. The plants, plant parts, plant cells and seeds of the invention may also be hybrid plants, plant parts, plant cells or seeds.

Also provided is a method to determine the presence or absence or zygosity status of a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B in a biological sample of a cereal plant, comprising providing genomic DNA from said biological sample, and analysing said DNA for the presence or absence or zygosity status of at least one marker allele linked to a functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B as described herein. It will be clear that the presence can be determined using a marker allele linked to the functional restorer gene as described herein, whereas the absence can (additionally) be determined by detecting the presence of the other, non-restoring allele. The zygosity status, i.e. whether the plant is homozygous for the restorer allele, homozygous for the non-restorer allele or heterozygous (i.e. the Rf genotype), can be determined by detecting the presence or absence of a marker allele linked to the functional restorer gene and by detecting the presence of the other, non-restoring allele, but depending on the parental origin it can also be sufficient to determine the presence or absence of only one of the alleles to be able to deduce the complete genotype (zygosity status) of the plant.
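The logic of the zygosity determination can be sketched as below. The first function combines the detection results for the restorer-linked and the non-restoring marker allele. The second shows one case in which a single assay suffices, namely progeny of a heterozygous plant crossed with a homozygous non-restorer plant; that cross is an assumed example of "parental origin", and the function names and return labels are introduced here.

```python
def zygosity(restorer_detected, non_restorer_detected):
    """Deduce the Rf genotype from two allele-specific detection results."""
    if restorer_detected and non_restorer_detected:
        return "heterozygous"
    if restorer_detected:
        return "homozygous restorer"
    if non_restorer_detected:
        return "homozygous non-restorer"
    return "no call"


def zygosity_in_backcross(restorer_detected):
    """Deduce the genotype from one assay, assuming an Rf/rf x rf/rf cross.

    Every progeny plant of such a cross carries at least one non-restoring
    allele, so detecting the restorer allele alone resolves the genotype.
    """
    return "heterozygous" if restorer_detected else "homozygous non-restorer"


print(zygosity(True, False))         # homozygous restorer
print(zygosity_in_backcross(True))   # heterozygous
```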

Also provided is a method for the identification and/or selection of a cereal (e.g. wheat) plant comprising a functional restorer gene allele for wheat G-type cytoplasmic male sterility comprising the steps of

-   a. Identifying or detecting in said plant the presence of the nucleic acid or the polypeptide encoding a functional restorer gene for wheat G-type cytoplasmic male sterility as described herein
-   b. and optionally selecting said plant comprising said nucleic acid or polypeptide.

Likewise, identifying or detecting can involve obtaining a biological sample (e.g. protein) or genomic DNA and determining the presence of the nucleic acid or polypeptide according to methods well known in the art, such as hybridization, PCR, RT-PCR, Southern blotting, Southern-by-sequencing, SNP detection methods (e.g. as described herein), western blotting, ELISA etc., e.g. based on the sequences provided herein.

The invention also provides the use of at least one marker comprising an allele linked to the functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B for the identification of at least one further marker comprising an allele linked to said functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B. Such markers are also genetically linked or tightly linked to the restorer gene, and are also within the scope of the invention. Markers can be identified by any of a variety of genetic or physical mapping techniques. Methods of determining whether markers are genetically linked to a restorer gene are known to those of skill in the art and include, for example, interval mapping (Lander and Botstein, (1989) Genetics 121:185), regression mapping (Haley and Knott, (1992) Heredity 69:315), MQM mapping (Jansen, (1994) Genetics 138:871) or rMQM mapping. In addition, such physical mapping techniques as chromosome walking, contig mapping and assembly, amplicon resequencing, transcriptome sequencing, targeted capture and sequencing, next generation sequencing and the like, can be employed to identify and isolate additional sequences useful as markers in the context of the present invention.

The invention further provides the use of at least one marker allele linked to a functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B as described herein for the identification of a plant comprising said functional restorer gene for wheat G-type cytoplasmic male sterility.

Also provided is the use of a plant obtained by any of the methods as described herein and comprising at least one marker allele linked to a functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B as described herein, for restoring fertility in a progeny of a G-type cytoplasmic male sterile cereal plant, such as a wheat plant, or for producing a population of hybrid cereal plants, such as wheat plants, or for producing hybrid seed.

Further provided is a method for identifying a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B, comprising the steps of

-   a. Providing a population of F2 plants resulting from selfing of a population of F1 plants obtained by crossing a female cereal parent plant with a male cereal parent plant, wherein the female parent plant is a G-type cytoplasmic male sterile cereal plant, and wherein the male parent plant comprises a functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B.
-   b. Classifying the fertility of a plurality of said F2 plants.
-   c. Determining the nucleotide sequence of at least part of the region of chromosome 1B comprising and flanked by the markers of SEQ ID NO 2 and SEQ ID NO 8 of genomic DNA isolated from each of said plurality of F2 plants.
-   d. Identifying the coding sequence within said region having the highest association to the phenotype of restored fertility, wherein the identified coding sequence is the functional restorer gene allele for wheat G-type cytoplasmic male sterility located on chromosome 1B.
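Step d can be pictured with the following minimal sketch, which ranks candidate coding sequences by how often the presence of the male-parent variant agrees with the fertility class of each F2 plant. Simple concordance is used here only as a stand-in for "association"; it is an assumption of this sketch, and an actual analysis would more likely rely on a statistical test or a mapping method. The candidate names and all data values are hypothetical.

```python
def concordance(presence, fertile):
    """Fraction of plants in which variant presence equals the fertility class."""
    agree = sum(1 for p, f in zip(presence, fertile) if p == f)
    return agree / len(fertile)


def best_candidate(candidates, fertile):
    """Return the candidate name with the highest concordance to fertility."""
    return max(candidates, key=lambda name: concordance(candidates[name], fertile))


# Hypothetical fertility classification of six F2 plants (True = fertile).
fertile = [True, True, False, True, False, True]

# Hypothetical presence of the male-parent variant of each candidate per plant.
candidates = {
    "cds_A": [True, True, False, True, False, True],
    "cds_B": [True, False, False, True, True, True],
}

print(best_candidate(candidates, fertile))  # cds_A
```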

In any of the above described methods or uses, the markers and marker alleles can localize to the same chromosomal intervals and can be selected from the same groups as described above for the other embodiments and aspects.

Also provided are any of the markers comprising an allele linked to the functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B, as described herein.

Also provided herein is a chromosome fragment, which comprises a functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B, as described throughout the specification. In one aspect the chromosome fragment is isolated from its natural environment. In another aspect it is in a plant cell, especially in a cereal cell, especially in a wheat cell. Also an isolated part of the chromosome fragment comprising the functional restorer gene for wheat G-type cytoplasmic male sterility located on chromosome 1B is provided herein. Such a chromosome fragment can for example be a contig or a scaffold, such as corresponding to SEQ ID NO. 16.

Further provided is a recombinant nucleic acid molecule, especially a recombinant DNA molecule, which comprises a functional restorer gene according to the invention. In one aspect the functional restorer gene is detectable by one or more of the molecular marker assays described herein. Also a DNA vector is provided comprising the recombinant DNA. The recombinant DNA molecule or DNA vector may be an isolated nucleic acid molecule. The DNA comprising the functional restorer gene may be in a microorganism, such as a bacterium (e.g. Agrobacterium or E. coli).

Thus, in one embodiment, the invention provides an (isolated) nucleic acid molecule encoding a functional restorer gene allele for wheat G-type cytoplasmic male sterility, wherein said functional restorer gene allele localises within an interval on chromosome 1B comprising and flanked by the markers of SEQ ID NO 2 and SEQ ID NO 8. Thus, the (isolated) nucleic acid molecule encodes or comprises a functional restorer gene allele for wheat G-type cytoplasmic male sterility that is derivable or derived from an interval on chromosome 1B comprising and flanked by the markers of SEQ ID NO 2 and SEQ ID NO 8. Said functional restorer gene allele can be identified and hence is identifiable using any of the markers and marker alleles linked to said functional restorer gene allele as described herein.

In a further embodiment, said functional restorer gene allele encoded by said (isolated) nucleic acid molecule localizes within an interval on chromosome 1B comprising and flanked by the markers of SEQ ID NO 11 and SEQ ID NO 14.

In a further embodiment, said functional restorer gene allele encoded by said (isolated) nucleic acid molecule localizes to the contig as represented by SEQ ID NO 15.

In a further embodiment, said functional restorer gene allele encoded by said (isolated) nucleic acid molecule can be a functional allele of a PPR gene localising within any of said intervals or to said contig.

In one embodiment, said (isolated) nucleic acid encoding said functional restorer gene allele encodes a(n) (isolated) polypeptide, such as a PPR protein, that has the capacity to (specifically) bind to the CMS ORF256 (SEQ ID NO. 23). Bind to or specifically bind to or (specifically) recognize, as used herein, means that according to the above described PPR code, the PPR protein contains a number of PPR motifs with specific residues at positions 5 and 35 and which are ordered in such a way so as to be able to bind to a target mRNA, in this case the ORF256 mRNA, in a sequence-specific or sequence-preferential manner.

For example, the functional restorer gene allele can encode a(n) (isolated) PPR protein containing PPR motifs with specific residues at the above indicated positions so as to recognize the target sequence AACTGTTTCTATTTGCAC of ORF256 (nt 129-146 of SEQ ID NO. 23). In one example, the predicted recognition sequence can be AUUUKCASNCNYACGU (SEQ ID NO. 22).

The functional restorer gene allele can also encode a PPR protein which when expressed is targeted to the mitochondrion. This can e.g. be accomplished by the presence of a (plant-functional) mitochondrial targeting sequence or mitochondrial signal peptide, or mitochondrial transit peptide. A mitochondrial targeting signal is a 10-70 amino acid long peptide that directs a newly synthesized protein to the mitochondria, typically found at the N-terminus. Mitochondrial transit peptides are rich in positively charged amino acids but usually lack negative charges. They have the potential to form amphipathic α-helices in nonaqueous environments, such as membranes. Mitochondrial targeting signals can contain additional signals that subsequently target the protein to different regions of the mitochondria, such as the mitochondrial matrix. Like signal peptides, mitochondrial targeting signals are cleaved once targeting is complete. Mitochondrial transit peptides are e.g. described in Shewry and Gutteridge (1992, Plant Protein Engineering, 143-146, and references therein), Sjoling and Glaser (Trends Plant Sci Volume 3, Issue 4, 1 Apr. 1998, Pages 136-140), Pfanner (2000, Current Biol, Volume 10, Issue 11), Huang et al. (2009, Plant Phys 150(3): 1272-1285), Chen et al. (1996, PNAS, Vol. 93, pp. 11763-11768), Fuji et al. (Plant J 2016). In one example, such a sequence can be aa 1-50 of SEQ ID NO. 20.

In a further embodiment, said functional restorer gene allele is a functional allele of a PPR gene encoded by SEQ ID NO. 16, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 21, or a PPR gene encoding the polypeptide of SEQ ID NO. 17 or SEQ ID NO. 20. For example, said functional restorer gene allele can comprise or encode a sequence that is substantially identical to SEQ ID NO. 16, SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21 as defined herein, such as at least 85%, 85.5%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 99.5% identical to SEQ ID NO. 16, SEQ ID NO. 17, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21.

In an even further embodiment, said functional restorer gene allele comprises the nucleotide sequence of SEQ ID NO. 16, SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 21 or encodes the polypeptide of SEQ ID NO. 17 or SEQ ID NO. 20.

In a further embodiment, said functional restorer gene allele encoded by said (isolated) nucleic acid molecule is obtainable from USDA accession number PI 583676.

Also provided is a(n) (isolated) polypeptide encoded by the nucleic acid molecule as described above (said polypeptide being a functional restorer protein for wheat G-type cytoplasmic male sterility).

The functional restorer gene allele may also be cloned and a chimeric gene may be made, e.g. by operably linking a plant expressible promoter to the functional restorer gene allele and optionally a 3′ end region involved in transcription termination and polyadenylation functional in plants. Such a chimeric gene may be introduced into a plant cell, and the plant cell may be regenerated into a whole plant to produce a transgenic plant. In one aspect the transgenic plant is a cereal plant, such as a wheat plant, according to any method well known in the art.

Thus, in a particular embodiment a chimeric gene is provided comprising a(n) (isolated) nucleic acid molecule encoding the functional restorer gene allele as described above, operably linked to a heterologous plant-expressible promoter and optionally a 3′ termination and polyadenylation region.

The use of such a (isolated or extracted) nucleic acid molecule and/or of such a chimeric gene and/or of such a chromosome fragment for generating plant cells and plants comprising a functional restorer gene allele is encompassed herein. In one aspect it may be used to generate transgenic cereal (e.g. wheat) cells, plants and plant parts or seeds comprising the functional restorer gene allele and the plant having the capacity to restore fertility against wheat G-type cytoplasmic male sterility as described above.

A host or host cell, such as a (cereal) plant cell or (cereal) plant or seed thereof, such as a wheat plant cell or plant or seed thereof, comprising the (isolated) nucleic acid molecule, (isolated) polypeptide, or the chimeric gene as described above is provided, wherein preferably said polypeptide, said nucleic acid, or said chimeric gene in each case is heterologous with respect to said plant cell or plant or seed. The host cell can e.g. also be a bacterium, such as E. coli or Agrobacterium (tumefaciens).

Thus, also provided is a method for producing a cereal plant cell or plant or seed thereof, such as a wheat plant cell or plant or seed thereof, comprising a functional restorer gene for wheat G-type cytoplasmic male sterility, or for increasing restoration capacity for wheat G-type cytoplasmic male sterility (“CMS”) in a cereal plant, such as a wheat plant, comprising the steps of providing said plant cell or plant with the (recombinant) chromosome fragment or the (isolated) nucleic acid molecule or the chimeric gene as described herein, wherein said providing comprises transformation, crossing, backcrossing, genome editing or mutagenesis. Restoration capacity, as used herein, means the capacity of a plant to restore fertility in the progeny of a cross with a G-type cytoplasmic male sterility (“CMS”) line. Preferably, said plant expresses or has increased expression of the polypeptide according to the invention. Preferably, said (increase in) expression is at least during (the early phases of) pollen development and meiosis, such as in anther or, more specifically, tapetum, or developing microspores (where said plant did not express or to a lesser extent expressed the polypeptide prior to the providing step).

Thus, also provided is a method for producing a cereal plant cell or plant or seed thereof, such as a wheat plant cell or plant or seed thereof, comprising a functional restorer gene for wheat G-type cytoplasmic male sterility, or for increasing restoration capacity for wheat G-type cytoplasmic male sterility (“CMS”) in a cereal plant, such as a wheat plant, comprising the steps of increasing the expression of the (isolated) polypeptide as described herein in said plant cell or plant or seed. Preferably, said (increase in) expression is at least during (the early phases of) pollen development and meiosis, such as in anther or, more specifically, tapetum, or developing microspores. Prior to the providing step said plant did not express or to a lesser extent expressed the polypeptide and/or did not have or to a lesser extent had restoration capacity for wheat G-type cytoplasmic male sterility (“CMS”).

Increasing the expression can be done by providing the plant with the (recombinant) chromosome fragment or the (isolated) nucleic acid molecule or the chimeric gene as described herein, whereby the nucleic acid encoding the functional restorer gene allele is under the control of appropriate regulatory elements such as a promoter driving expression in the desired tissues/cells, but also by providing the plant with transcription factors that e.g. (specifically) recognise the promoter region and promote transcription, such as TAL effectors, dCas, dCpf1, etc., coupled to transcriptional enhancers.

Further described is a method for converting a cereal plant, such as a wheat plant, not having the capacity to restore fertility in the progeny of a cross with a G-type cytoplasmic male sterility (“CMS”) line (a non-restorer plant) into a plant having the capacity to restore fertility in the progeny of a cross with a G-type cytoplasmic male sterility (“CMS”) line (a restorer plant), comprising the steps of modifying the genome of said plant to comprise the (isolated) nucleic acid molecule or the chimeric gene encoding a functional restorer gene allele for wheat G-type cytoplasmic male sterility as described herein, wherein said modifying comprises transformation, crossing, backcrossing, genome editing or mutagenesis. Preferably, said plant expresses the polypeptide according to the invention, particularly at least during (the early phases of) pollen development and meiosis, such as in anther or, more specifically, tapetum, or developing microspores. Prior to said modification said plant did not express or to a lesser extent expressed the polypeptide and/or did not have or to a lesser extent had restoration capacity for wheat G-type cytoplasmic male sterility (“CMS”).

Thus, also provided is a method for converting a non-restoring cereal plant, such as a wheat plant, into a restoring plant for wheat G-type cytoplasmic male sterility (“CMS”), or for increasing restoration capacity for wheat G-type cytoplasmic male sterility (“CMS”) in a cereal plant, such as a wheat plant, comprising the steps of modifying the genome of said plant to increase the expression of a polypeptide according to the invention in said plant. Preferably, said (increase in) expression is at least during (the early phases of) pollen development and meiosis, such as in anther or, more specifically, tapetum, or developing microspores. Prior to said modification said plant did not express or to a lesser extent expressed the polypeptide and/or did not have or to a lesser extent had restoration capacity for wheat G-type cytoplasmic male sterility (“CMS”).

Modifying the genome to increase expression of the polypeptide can for example be done by modifying the native promoter to include regulatory elements that increase transcription, such as certain enhancer elements, but also by inactivating or removing certain negative regulatory elements, such as repressor elements or target sites for miRNAs or lncRNAs. The Rf3 5′/upstream region including the promoter is included in SEQ ID NO. 21, e.g. as represented by nt 7907-11981 or fragments thereof.

Also described is a plant cell or plant, preferably a cereal plant cell or cereal plant or seed thereof, such as a wheat plant cell or plant or seed thereof, produced according to any of the above methods, preferably wherein said plant has an increased restoration capacity for wheat G-type cytoplasmic male sterility (“CMS”) compared to said plant prior to the providing step or the modification step. Use of such a plant obtained according to the above methods to restore fertility in the progeny of a cross with a G-type cytoplasmic male sterility (“CMS”) plant or to produce hybrid plants or hybrid seed is also described. Such a plant cell, plant or seed can be a hybrid plant cell, plant or seed.

Genome editing, as used herein, refers to the targeted modification of genomic DNA using sequence-specific enzymes (such as endonucleases, nickases, base conversion enzymes) and/or donor nucleic acids (e.g. dsDNA, oligos) to introduce desired changes in the DNA. Sequence-specific nucleases that can be programmed to recognize specific DNA sequences include meganucleases (MGNs), zinc-finger nucleases (ZFNs), TAL-effector nucleases (TALENs) and RNA-guided or DNA-guided nucleases such as Cas9, Cpf1, CasX, CasY, C2c1, C2c3, and certain Argonaute systems (see e.g. Osakabe and Osakabe, Plant Cell Physiol. 2015 March; 56(3):389-400; Ma et al., Mol Plant. 2016 Jul. 6; 9(7):961-74; Bortesi et al., Plant Biotech J, 2016, 14; Murovec et al., Plant Biotechnol J. 2017 Apr. 1; Nakade et al., Bioengineered 8-3, 2017; Burstein et al., Nature 542, 237-241; Komor et al., Nature 533, 420-424, 2016; all incorporated herein by reference). Donor nucleic acids can be used as a template for repair of the DNA break induced by a sequence-specific nuclease, but can also be used as such for gene targeting (without DNA break induction) to introduce a desired change into the genomic DNA.

Accordingly, using these technologies, plants lacking a functional restorer gene for wheat G-type cytoplasmic male sterility (non-restoring plants) can be converted to restoring plants by making the desired changes to existing PPR genes or, alternatively, by introducing one or more complete sequences encoding functional PPR Rf proteins, e.g. as described herein, at a specific genomic location.

Mutagenesis, as used herein, refers to e.g. EMS mutagenesis or radiation-induced mutagenesis and the like.

Thus, transgenic cereal cells, e.g. transgenic wheat cells, comprising in their genome a recombinant chromosome fragment as described or an (isolated) nucleic acid molecule as described or a chimeric gene as described comprising a functional restorer gene allele as described are also an embodiment of the invention. In one aspect the DNA molecule comprising the Rf allele is stably integrated into the cereal (e.g. wheat) genome.

Thus, cereal plants, plant parts, plant cells, or seeds thereof, especially wheat, comprising a chromosome fragment or a nucleic acid molecule according to the invention or a polypeptide according to the invention or a chimeric gene according to the invention encoding a functional restorer gene according to the invention, are provided herein, said plant having the capacity to restore fertility against wheat G-type cytoplasmic male sterility. In one embodiment, the chromosome fragment, nucleic acid molecule, polypeptide or chimeric gene is heterologous to the plant, such as in transgenic cereal plants or transgenic wheat plants. This also includes plant cells or cell cultures comprising such a chromosome fragment or nucleic acid molecule, polypeptide or chimeric gene, independent of whether introduced by transgenic methods or by breeding methods. The cells are e.g. in vitro and are regenerable into plants comprising the chromosome fragment or chimeric gene of the invention. Said plants, plant parts, plant cells and seeds may also be hybrid plants, plant parts, plant cells or seeds.

Such plants may also be used as a male parent plant in a method for producing F1 hybrid seeds or F1 hybrid plants, as described above.

A plant-expressible promoter as used herein can be any promoter that drives sufficient expression at least during (early) pollen development and meiosis, such as in anther or, more specifically, tapetum, or developing microspores. This can for example be a constitutive promoter, an inducible promoter, but also a pollen-, anther- or, more specifically, tapetum- or microspore-specific/preferential promoter.

A constitutive promoter is a promoter capable of directing high levels of expression in most cell types (in a spatio-temporal independent manner). Examples of plant-expressible constitutive promoters include promoters of bacterial origin, such as the octopine synthase (OCS) and nopaline synthase (NOS) promoters from Agrobacterium, but also promoters of viral origin, such as that of the cauliflower mosaic virus (CaMV) 35S transcript (Harpster et al., 1988, Mol. Gen. Genet. 212: 182-190) or 19S RNA genes (Odell et al., 1985, Nature. 6; 313(6005):810-2; U.S. Pat. No. 5,352,605; WO 84/02913; Benfey et al., 1989, EMBO J. 8:2195-2202), the enhanced 2×35S promoter (Kay et al., 1987, Science 236:1299-1302; Datla et al. (1993), Plant Sci 94:139-149), promoters of the cassava vein mosaic virus (CsVMV; WO 97/48819, U.S. Pat. No. 7,053,205), 2×CsVMV (WO2004/053135), the circovirus (AU 689 311) promoter, the sugarcane bacilliform badnavirus (ScBV) promoter (Samac et al., 2004, Transgenic Res. 13(4):349-61), the figwort mosaic virus (FMV) promoter (Sanger et al., 1990, Plant Mol Biol. 14(3):433-43), the subterranean clover virus promoter No 4 or No 7 (WO 96/06932) and the enhanced 35S promoter as described in U.S. Pat. Nos. 5,164,316, 5,196,525, 5,322,938, 5,359,142 and 5,424,200. Among the promoters of plant origin, mention will be made of the promoters of the plant ribulose-bisphosphate carboxylase/oxygenase (Rubisco) small subunit (U.S. Pat. No. 4,962,028; WO99/25842) from Zea mays and sunflower, the promoter of the Arabidopsis thaliana histone H4 gene (Chabouté et al., 1987), the ubiquitin promoters (Holtorf et al., 1995, Plant Mol. Biol. 29:637-649, U.S. Pat. No. 5,510,474) of maize, rice and sugarcane, the rice actin 1 promoter (Act-1, U.S. Pat. No. 5,641,876), the histone promoters as described in EP 0 507 698 A1, and the maize alcohol dehydrogenase 1 promoter (Adh-1) (from http://www.patentlens.net/daisy/promoters/242.html). Also the small subunit promoter from Chrysanthemum may be used if that use is combined with the use of the respective terminator (Outchkourov et al., Planta, 216: 1003-1012, 2003).

Pollen/microspore-active promoters include e.g. a maize pollen-specific promoter (see, e.g., Guerrero (1990) Mol. Gen. Genet. 224:161-168), PTA29, PTA26 and PTA13 (e.g., see U.S. Pat. No. 5,792,929) and as described in e.g. Baerson et al. (1994 Plant Mol. Biol. 26: 1947-1959), the NMT19 microspore-specific promoter as e.g. described in WO97/30166. Further anther/pollen-specific or anther/pollen-active promoters are described in e.g. Khurana et al., 2012 (Critical Reviews in Plant Sciences, 31: 359-390), WO2005100575, WO 2008037436. Other suitable promoters are e.g. the barley vrn1 promoter, such as described in Alonso-Peral et al. (2011, PLoS One 6(12):e29456).

It will be clear that the herein identified nucleic acids and polypeptides encoding functional restorer genes can be used to identify further functional restorer genes for wheat G-type cytoplasmic male sterility. Thus, the invention also provides the use of the (isolated) nucleic acids or polypeptides as disclosed herein, such as SEQ ID NO. 16 or 17, to identify one or more further functional restorer genes for wheat G-type cytoplasmic male sterility.

Further, homologous or substantially identical functional restorer genes can be identified using methods known in the art. Homologous nucleotide sequences may be identified and isolated by hybridization under stringent or high stringent conditions using as probes a nucleic acid comprising e.g. the nucleotide sequence of SEQ ID NO: 16 or part thereof, as described above. Other sequences encoding functional restorer genes may also be obtained by DNA amplification using oligonucleotides specific for genes encoding functional restorer genes as primers, such as but not limited to oligonucleotides comprising or consisting of about 20 to about 50 consecutive nucleotides from SEQ ID NO: 16 or its complement. Homologous or substantially identical functional restorer genes can be identified in silico using Basic Local Alignment Search Tool (BLAST) homology search with the nucleotide or amino acid sequences as provided herein.

Functionality of restorer genes or alleles thereof, such as identified as above, can be validated for example by providing, e.g. by transformation or crossing, such a restorer gene under control of a plant-expressible promoter in a cereal (wheat) plant that does not have the capacity to restore fertility of offspring of a G-type cytoplasmic male sterile wheat plant, crossing the thus generated cereal plant with a G-type cytoplasmic male sterile wheat plant and evaluating seed set in the progeny. Alternatively, a restorer line can be transformed with an RNAi construct or gene-edited with e.g. CRISPR-Cas technology or any other sequence-specific nuclease so as to generate a loss of function that renders the plant non-restoring. Similarly, other means for mutating the restorer gene (e.g. EMS, γ-radiation) can be used to evaluate the effect of a loss-of-function mutation on restoring ability.

In any of the herein described embodiments and aspects the plant may comprise or may be selected to comprise or may be provided with a further functional restorer gene for wheat G-type cytoplasmic male sterility (located on or obtainable from the same or another chromosome), such as Rf1 (1A), Rf2 (7D), Rf4 (6B), Rf5 (6D), Rf6 (5D), Rf7 (7B), Rf8, 6AS or 6BS (Tahir & Tsunewaki, 1969; Yen et al., 1969; Bahl & Maan, 1973; Du et al., 1991; Sinha et al., 2013; Ma et al., 1991; Zhou et al., 2005).

Any of the herein described methods, markers and marker alleles, nucleic acids, polypeptides, chimeric genes, plants, etc. may also be used to restore fertility against S^(v)-type cytoplasm, as e.g. described in Ahmed et al. 2001 (supra).

As used herein a “chimeric gene” refers to a nucleic acid construct which is not normally found in a plant species. A chimeric nucleic acid construct can be DNA or RNA. “Chimeric DNA construct” and “chimeric gene” are used interchangeably to denote a gene in which the promoter or one or more other regulatory regions, such as the transcription termination and polyadenylation region of the gene, are not associated in nature with part or all of the transcribed DNA region, or a gene which is present in a locus in the plant genome in which it does not occur naturally or present in a plant in which it does not naturally occur. In other words, the gene and the operably-linked regulatory region or the gene and the genomic locus or the gene and the plant are heterologous with respect to each other, i.e. they do not naturally occur together.

A first nucleic acid sequence is “operably linked” with a second nucleic acid sequence when the first nucleic acid sequence is in a functional relationship with the second nucleic acid sequence. For example, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. When recombinantly produced, operably linked nucleic acid sequences are generally contiguous, and, where necessary to join two protein-coding regions, in the same reading frame (e.g., in a polycistronic ORF). However, nucleic acids need not be contiguous to be operably linked.

“Backcrossing” refers to a breeding method by which a (single) trait, such as fertility restoration (Rf), can be transferred from one genetic background (a “donor”) into another genetic background (also referred to as “recurrent parent”), e.g. a plant not comprising such an Rf gene or locus. An offspring of a cross (e.g. an F1 plant obtained by crossing an Rf-containing plant with an Rf-lacking plant; or an F2 plant or F3 plant, etc., obtained from selfing the F1) is “backcrossed” to the parent. After repeated backcrossing (BC1, BC2, etc.) and optionally selfings (BC1S1, BC2S1, etc.), the trait of the one genetic background is incorporated into the other genetic background.
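
As a rough illustration of why repeated backcrossing moves a trait into the recurrent background, the expected share of recurrent-parent genome per generation can be sketched as follows. This is a minimal sketch under stated assumptions that are not taken from the text: no selection, unlinked loci, and each backcross halving the remaining donor fraction on average. The function name is illustrative only.

```python
def expected_recurrent_fraction(n_backcrosses: int) -> float:
    """Expected fraction of recurrent-parent genome after n backcrosses.

    Assumption (illustrative, not from the description): the F1 carries
    50% of each parent, and every backcross to the recurrent parent
    halves the remaining donor fraction on average.
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be >= 0")
    donor_fraction = 0.5 ** (n_backcrosses + 1)
    return 1.0 - donor_fraction


if __name__ == "__main__":
    for generation in range(5):
        label = "F1" if generation == 0 else f"BC{generation}"
        print(label, f"{expected_recurrent_fraction(generation):.4f}")
```

These are averages over the whole genome; donor segments linked to the selected Rf locus are typically retained for longer, which is one reason markers close to the locus are useful during backcrossing.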

“Marker assisted selection” or “MAS” is a process of using the presence of molecular markers, which are genetically linked to a particular locus or to a particular chromosome region (e.g. introgression fragment), to select plants for the presence of the specific locus or region (introgression fragment). For example, a molecular marker genetically and physically linked to an Rf locus can be used to detect and/or select plants comprising the Rf locus. The closer the genetic linkage of the molecular marker to the locus, the less likely it is that the marker is dissociated from the locus through meiotic recombination.

“LOD-score” (logarithm (base 10) of odds) refers to a statistical test often used for linkage analysis in animal and plant populations. The LOD score compares the likelihood of obtaining the test data if the two loci (molecular marker loci and/or a phenotypic trait locus) are indeed linked, to the likelihood of observing the same data purely by chance. Positive LOD scores favor the presence of linkage and a LOD score greater than 3.0 is considered evidence for linkage. A LOD score of +3 indicates 1000 to 1 odds that the linkage being observed did not occur by chance.
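
The definition above can be made concrete with a small two-point calculation. The sketch below is an illustration only and assumes a simple setting that is not described in the text: every meiosis can be scored directly as recombinant or non-recombinant, and the likelihood at a recombination fraction theta is compared with the likelihood at theta = 0.5 (no linkage). The function name and arguments are hypothetical.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """log10 of L(theta) / L(0.5) for directly scored meioses.

    Assumption (illustrative): phase-known data in which each meiosis is
    counted as recombinant or non-recombinant.
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    non_recombinants = meioses - recombinants
    if theta == 0.0 and recombinants > 0:
        # A recombinant is impossible at theta = 0, so the likelihood is zero.
        return float("-inf")

    def log10_likelihood(t: float) -> float:
        total = 0.0
        if recombinants:
            total += recombinants * math.log10(t)
        if non_recombinants:
            total += non_recombinants * math.log10(1.0 - t)
        return total

    return log10_likelihood(theta) - log10_likelihood(0.5)


if __name__ == "__main__":
    lod = two_point_lod(recombinants=0, meioses=10, theta=0.0)
    print(f"LOD = {lod:.4f}, odds = {10 ** lod:.0f} : 1")
```

Under these assumptions, ten meioses without a single recombinant give a LOD just above 3, which corresponds to the 1000 to 1 odds mentioned in the definition.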

A “biological sample” can be a plant or part of a plant such as a plant tissue or a plant cell.

“Providing genomic DNA” as used herein refers to providing a sample comprising genomic DNA from the plant. The sample can refer to a tissue sample which has been obtained from said plant, such as, for example, a leaf sample, comprising genomic DNA from said plant. The sample can further refer to genomic DNA which is obtained from a tissue sample, such as genomic DNA which has been obtained from a tissue, such as a leaf sample. Providing genomic DNA can include, but does not need to include, purification of genomic DNA from the tissue sample. Providing genomic DNA thus also includes obtaining tissue material from a plant or larger piece of tissue and preparing a crude extract or lysate therefrom.

“Isolated DNA” as used herein refers to DNA not occurring in its natural genomic context, irrespective of its length and sequence. Isolated DNA can, for example, refer to DNA which is physically separated from the genomic context, such as a fragment of genomic DNA. Isolated DNA can also be an artificially produced DNA, such as a chemically synthesized DNA, or such as DNA produced via amplification reactions, such as polymerase chain reaction (PCR) well-known in the art. Isolated DNA can further refer to DNA present in a context of DNA in which it does not occur naturally. For example, isolated DNA can refer to a piece of DNA present in a plasmid. Further, the isolated DNA can refer to a piece of DNA present in another chromosomal context than the context in which it occurs naturally, such as for example at another position in the genome than the natural position, in the genome of another species than the species in which it occurs naturally, or in an artificial chromosome.

Whenever reference to a “plant” or “plants” according to the invention is made, it is understood that also plant parts (cells, tissues or organs, seed pods, seeds, severed parts such as roots, leaves, flowers, pollen, etc.), progeny of the plants which retain the distinguishing characteristics of the parents (especially the restoring capacity), such as seed obtained by selfing or crossing, e.g. hybrid seed (obtained by crossing two inbred parental lines), hybrid plants and plant parts derived therefrom are encompassed herein, unless otherwise indicated.

In some embodiments, the plant cells of the invention may be non-propagating cells.

The obtained plants according to the invention can be used in a conventional breeding scheme to produce more plants with the same characteristics or to introduce the characteristic of the presence of the restorer gene according to the invention in other varieties of the same or related plant species, or in hybrid plants. The obtained plants can further be used for creating propagating material. Plants according to the invention can further be used to produce gametes, seeds, flour, embryos, either zygotic or somatic, progeny or hybrids of plants obtained by methods of the invention. Seeds obtained from the plants according to the invention are also encompassed by the invention.

“Creating propagating material”, as used herein, relates to any means known in the art to produce further plants, plant parts or seeds and includes inter alia vegetative reproduction methods (e.g. air or ground layering, division, (bud) grafting, micropropagation, stolons or runners, storage organs such as bulbs, corms, tubers and rhizomes, striking or cutting, twin-scaling), sexual reproduction (crossing with another plant) and asexual reproduction (e.g. apomixis, somatic hybridization).

Transformation, as used herein, means introducing a nucleotide sequence into a plant in a manner to cause stable or transient expression of the sequence. Transformation and regeneration of both monocotyledonous and dicotyledonous plant cells is now routine, and the selection of the most appropriate transformation technique will be determined by the practitioner. The choice of method will vary with the type of plant to be transformed; those skilled in the art will recognize the suitability of particular methods for given plant types. Suitable methods can include, but are not limited to: electroporation of plant protoplasts; liposome-mediated transformation; polyethylene glycol (PEG) mediated transformation; transformation using viruses; micro-injection of plant cells; micro-projectile bombardment of plant cells; vacuum infiltration; and Agrobacterium-mediated transformation.

As used herein, the term “homologous” or “substantially identical” may refer to nucleotide sequences that are more than 85% identical. For example, a substantially identical nucleotide sequence may be 85.5%; 86%; 87%; 88%; 89%; 90%; 91%; 92%; 93%; 94%; 95%; 96%; 97%; 98%; 99% or 99.5% identical to the reference sequence. A probe may also be a nucleic acid molecule that is “specifically hybridizable” or “specifically complementary” to an exact copy of the marker to be detected (“DNA target”). “Specifically hybridizable” or “specifically complementary” are terms that indicate a sufficient degree of complementarity such that stable and specific binding occurs between the nucleic acid molecule and the DNA target. A nucleic acid molecule need not be 100% complementary to its target sequence to be specifically hybridizable. A nucleic acid molecule is specifically hybridizable when there is a sufficient degree of complementarity to avoid non-specific binding of the nucleic acid to non-target sequences under conditions where specific binding is desired, for example, under stringent hybridization conditions, preferably highly stringent conditions.

“Stringent hybridization conditions” can be used to identify nucleotide sequences which are homologous or substantially identical to a given nucleotide sequence. Stringent conditions are sequence dependent and will be different in different circumstances. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequences at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Typically, stringent conditions will be chosen in which the salt concentration is about 0.02 molar at pH 7 and the temperature is at least 60° C. Lowering the salt concentration and/or increasing the temperature increases stringency. Stringent conditions for RNA-DNA hybridizations (Northern blots using a probe of e.g. 100 nt) are for example those which include at least one wash in 0.2×SSC at 63° C. for 20 min, or equivalent conditions.

“High stringency conditions” can be provided, for example, by hybridization at 65° C. in an aqueous solution containing 6×SSC (20×SSC contains 3.0 M NaCl, 0.3 M Na-citrate, pH 7.0), 5×Denhardt's (100×Denhardt's contains 2% Ficoll, 2% polyvinylpyrrolidone, 2% Bovine Serum Albumin), 0.5% sodium dodecyl sulphate (SDS), and 20 μg/ml denatured carrier DNA (single-stranded fish sperm DNA, with an average length of 120-3000 nucleotides) as non-specific competitor. Following hybridization, high stringency washing may be done in several steps, with a final wash (about 30 min) at the hybridization temperature in 0.2-0.1×SSC, 0.1% SDS.
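
Because the hybridization and wash buffers are stated as multiples of SSC, it can help to convert them into molar concentrations. The sketch below only rescales the 20×SSC composition given above (3.0 M NaCl, 0.3 M Na-citrate); the function and constant names are illustrative assumptions.

```python
# Composition of the 20x stock as stated in the description.
STOCK_FOLD = 20.0
STOCK_MOLARITY = {"NaCl": 3.0, "Na-citrate": 0.3}


def ssc_molarity(fold: float) -> dict:
    """Molar concentration of each SSC component at a given fold strength."""
    if fold < 0:
        raise ValueError("fold must be non-negative")
    return {
        solute: molarity * fold / STOCK_FOLD
        for solute, molarity in STOCK_MOLARITY.items()
    }


if __name__ == "__main__":
    # Strengths mentioned for hybridization and washing in the description.
    for fold in (6, 2, 1, 0.2, 0.1):
        conc = ssc_molarity(fold)
        print(
            f"{fold}x SSC: {conc['NaCl']:.4f} M NaCl, "
            f"{conc['Na-citrate']:.5f} M Na-citrate"
        )
```

For instance, the 6×SSC hybridization solution corresponds to 0.9 M NaCl, while a final 0.2×SSC wash corresponds to 0.03 M NaCl, in line with the statement that lowering the salt concentration increases stringency.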

“Moderate stringency conditions” refers to conditions equivalent to hybridization in the above described solution but at about 60-62° C. Moderate stringency washing may be done at the hybridization temperature in 1×SSC, 0.1% SDS.

“Low stringency” refers to conditions equivalent to hybridization in the above described solution at about 50-52° C. Low stringency washing may be done at the hybridization temperature in 2×SSC, 0.1% SDS. See also Sambrook et al. (1989) and Sambrook and Russell (2001).

For the purpose of this invention, the “sequence identity” of two related nucleotide or amino acid sequences, expressed as a percentage, refers to the number of positions in the two optimally aligned sequences which have identical residues (×100) divided by the number of positions compared. A gap, i.e., a position in an alignment where a residue is present in one sequence but not in the other, is regarded as a position with non-identical residues. The “optimal alignment” of two sequences is found by aligning the two sequences over the entire length according to the Needleman and Wunsch global alignment algorithm (Needleman and Wunsch, 1970, J Mol Biol 48(3):443-53) in The European Molecular Biology Open Software Suite (EMBOSS, Rice et al., 2000, Trends in Genetics 16(6): 276-277; see e.g. http://www.ebi.ac.uk/emboss/align/index.html) using default settings (gap opening penalty=10 (for nucleotides)/10 (for proteins) and gap extension penalty=0.5 (for nucleotides)/0.5 (for proteins)). For nucleotides the default scoring matrix used is EDNAFULL and for proteins the default scoring matrix is EBLOSUM62. It will be clear that whenever nucleotide sequences of RNA molecules are defined by reference to nucleotide sequence of corresponding DNA molecules, the thymine (T) in the nucleotide sequence should be replaced by uracil (U). Whether reference is made to RNA or DNA molecules will be clear from the context of the application.
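
The counting rule in this definition can be sketched in a few lines. The example below does not perform the Needleman and Wunsch alignment itself; it assumes two sequences that have already been globally aligned (for instance by EMBOSS) and are supplied as equal-length strings with "-" marking gaps. The function name and the gap symbol are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Identical residues (x100) divided by the number of aligned positions;
    a position with a gap in either sequence counts as non-identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    identical = sum(
        1
        for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper())
        if res_a == res_b and res_a != gap
    )
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    # Toy alignment with one gap: 5 identical positions out of 6 compared.
    print(f"{percent_identity('ACGT-A', 'ACGTTA'):.2f}%")
```

In the toy alignment the single gap position lowers the identity in the same way as a mismatch would, which is the behaviour the definition prescribes.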

As used herein “comprising” is to be interpreted as specifying the presence of the stated features, integers, steps or components as referred to, but does not preclude the presence or addition of one or more features, integers, steps or components, or groups thereof. Thus, e.g., a nucleic acid or protein comprising a sequence of nucleotides or amino acids may comprise more nucleotides or amino acids than the actually cited ones, i.e., be embedded in a larger nucleic acid or protein. A chimeric gene comprising a nucleic acid which is functionally or structurally defined may comprise additional DNA regions etc.

Unless stated otherwise in the Examples, all recombinant DNA techniques are carried out according to standard protocols as described in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, NY and in Volumes 1 and 2 of Ausubel et al. (1994) Current Protocols in Molecular Biology, Current Protocols, USA. Standard materials and methods for plant molecular work are described in Plant Molecular Biology Labfax (1993) by R. D. D. Croy, jointly published by BIOS Scientific Publications Ltd (UK) and Blackwell Scientific Publications, UK. Other references for standard molecular biology techniques include Sambrook and Russell (2001) Molecular Cloning: A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratory Press, NY, Volumes I and II of Brown (1998) Molecular Biology LabFax, Second Edition, Academic Press (UK). Standard materials and methods for polymerase chain reactions can be found in Dieffenbach and Dveksler (1995) PCR Primer: A Laboratory Manual, Cold Spring Harbor Laboratory Press, and in McPherson et al. (2000) PCR—Basics: From Background to Bench, First Edition, Springer Verlag, Germany.

All patents, patent applications, and publications or public disclosures (including publications on internet) referred to or cited herein are incorporated by reference in their entirety.

The sequence listing contained in the file named “BCS16-2008-WO1_ST25”, which is 83 kilobytes (size as measured in Microsoft Windows®), contains 26 sequences SEQ ID NO: 1 through SEQ ID NO: 26, is filed herewith by electronic submission and is incorporated by reference herein.

The invention will be further described with reference to the examples described herein; however, it is to be understood that the invention is not limited to such examples.

SEQUENCES

SEQ ID NO. 1-SEQ ID NO. 14: marker sequences (see table 1 and 2)

SEQ ID NO. 15: Contig containing Rf3-PPR

SEQ ID NO. 16: Coding sequence Rf3-PPR

SEQ ID NO. 17: Amino acid sequence Rf3-PPR

SEQ ID NO. 18: manually annotated mRNA Rf3-PPR

- Nt 1-64: 5′ UTR
- Nt 65-2437: CDS
- Nt 2438-3438: 3′ UTR

SEQ ID NO. 19: manually annotated coding sequence Rf3-PPR

SEQ ID NO. 20: manually annotated amino acid sequence Rf3-PPR

SEQ ID NO. 21: Genome sequence Rf3-PPR

- Nt 4468-5470: 3′ UTR (complement)
- Nt 5471-7843: CDS (complement)
- Nt 7844-7907: 5′ UTR (complement)
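
The "(complement)" annotations mean that the features are read from the reverse strand of SEQ ID NO. 21. A minimal sketch of how such 1-based, inclusive coordinates can be turned into a sense-strand sequence is given below. It is an illustration only: the helper names are hypothetical, the demonstration runs on a toy string rather than on the actual sequence listing, and only unambiguous A/C/G/T bases are handled.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Feature coordinates as listed for SEQ ID NO. 21 (1-based, inclusive).
SEQ21_FEATURES = {
    "3' UTR": (4468, 5470),
    "CDS": (5471, 7843),
    "5' UTR": (7844, 7907),
}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def feature_length(start: int, end: int) -> int:
    """Length of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def extract_feature(genome: str, start: int, end: int,
                    complement: bool = False) -> str:
    """Return the feature sequence; reverse-complement it if annotated so."""
    if not 1 <= start <= end <= len(genome):
        raise ValueError("coordinates outside the sequence")
    fragment = genome[start - 1:end]
    return reverse_complement(fragment) if complement else fragment


if __name__ == "__main__":
    # Toy sequence, not the sequence listing.
    toy = "AACGGGTT"
    print(extract_feature(toy, 1, 4, complement=True))
    for name, (start, end) in SEQ21_FEATURES.items():
        print(name, feature_length(start, end), "nt")
```

With these coordinates the genomic CDS spans 2373 nt, the same length as the CDS annotated at nt 65-2437 of the mRNA of SEQ ID NO. 18.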

SEQ ID NO. 22: predicted RNA target

SEQ ID NO. 23: ORF256

- Nt 84-857: CDS

SEQ ID NO. 24: Fw primer

SEQ ID NO. 25: Rev primer

SEQ ID NO. 26: Probe

EXAMPLES

Example 1: Plant Materials and Genetic Mapping

A male sterile line carrying Triticum timopheevii CMS, CMS005, and a male sterility restorer line responding to Triticum timopheevii CMS (T. timopheevii /2* lowin //2* Quivira, Accession number PI 583676, USDA National Small Grains Collection, also known as Dekalb 582M and registered as US PVP 7400045, available via the National Plant Germplasm System, https://npgsweb.ars-grin.gov/gringlobal/accessiondetail.aspx?id=1478647), were used as parents to generate F1 progeny. The F1 progeny was selfed to generate an F2 population. The F2 population, consisting of 281 individuals, was used for identification of the markers linked to the restorer locus. A genetic map with a total of 2080 SNP markers was established and covered all chromosomes of the wheat genome. Chromosome 1B is described by 150 SNP markers.

Example 2: Fertility Classification and Coarse Mapping

The 276 plants in this F2 population were phenotypically classified according to seed set on the main, bagged head. Plants without seeds under the bag were classified as sterile. Plants with seed set were classified as fertile. FIG. 1 details the number of F2 plants per number of seeds set on a single head for 2 different locations. In the 2 locations, 41 and 45 F2 plants, respectively, were classified as sterile. Fully sterile F2 plants were noticed in the 2 locations.

Using a genetic map of 2080 SNP markers, QTL analysis was carried out using Haley-Knott regression to test the effect of variation in seed set across all markers. Significant marker-trait associations are distinguished by −log10-transformed p-values higher than 3. Thus, an interval of significantly associated markers was delineated, including left and right flanking markers (SEQ ID NO. 2 and SEQ ID NO. 8). The marker with the highest significance and biggest effect on restoration is the peak marker of SEQ ID NO. 6 (as indicated by X in Table 1 below). The interval of significantly associated markers was delineated using the following criteria: significance threshold at 2.5, significance drop at 1.5 and significance drop between peaks at 2. This delimited the interval to 15.8 cM for 1B by the left and right flanking markers (FIG. 2).

TABLE 1 Markers in the interval with significance of marker-trait association and effect size on restoration (in number of seeds above average seed set in the entire population) on 1B.

| SEQ ID NO | Rf donor allele | SNP position in SEQ ID NO | Chromosome position (cM) | Significance (−log10(p)) | Significance interval | Mean seed set | Additive effect on seed set | Dominance effect on seed set | Phenotypic variance explained |
|---|---|---|---|---|---|---|---|---|---|
| 1 | A | 51 | 35.206 | 7.51 |   | 31.68 | 10.05 | 1.02 | 0.13 |
| 2 | T | 51 | 35.753 | 8.2 | x | 31.4 | 10.51 | 1.32 | 0.14 |
| 3 | C | 51 | 38.165 | 10.49 | x | 31.26 | 11.8 | 1.85 | 0.18 |
| 4 | T | 51 | 38.37 | 10.51 | x | 31.26 | 11.8 | 1.83 | 0.18 |
| 5 | T | 51 | 42.221 | 10.41 | x | 30.65 | 11.78 | 2.56 | 0.18 |
| 6 | A | 32 | 46.302 | 10.67 | X | 31.51 | 12.06 | 1.53 | 0.18 |
| 7 | A | 51 | 46.911 | 10.5 | x | 31.52 | 11.97 | 1.51 | 0.18 |
| 8 | G | 51 | 51.588 | 9.05 | x | 30.85 | 10.85 | 2.44 | 0.15 |
| 9 | T | 51 | 51.772 | 9.05 |   | 30.85 | 10.85 | 2.44 | 0.15 |
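
The peak marker and the interval length quoted in the text can be read off Table 1 directly. The short sketch below does this for the positions, significance values and interval flags transcribed from Table 1; it does not re-implement the QTL analysis or the threshold and drop criteria, and the class and variable names are illustrative.

```python
from typing import NamedTuple


class Marker(NamedTuple):
    seq_id_no: int
    position_cm: float
    neg_log10_p: float
    in_interval: bool  # flagged "x" or "X" in Table 1


# Values transcribed from Table 1 (chromosome 1B).
TABLE_1 = [
    Marker(1, 35.206, 7.51, False),
    Marker(2, 35.753, 8.20, True),
    Marker(3, 38.165, 10.49, True),
    Marker(4, 38.370, 10.51, True),
    Marker(5, 42.221, 10.41, True),
    Marker(6, 46.302, 10.67, True),
    Marker(7, 46.911, 10.50, True),
    Marker(8, 51.588, 9.05, True),
    Marker(9, 51.772, 9.05, False),
]

# Peak marker: highest -log10(p) in the table.
peak = max(TABLE_1, key=lambda m: m.neg_log10_p)

# Flanking markers: outermost markers flagged as inside the interval.
flagged = [m for m in TABLE_1 if m.in_interval]
left = min(flagged, key=lambda m: m.position_cm)
right = max(flagged, key=lambda m: m.position_cm)
interval_cm = right.position_cm - left.position_cm

if __name__ == "__main__":
    print(f"peak marker: SEQ ID NO. {peak.seq_id_no}")
    print(f"flanking markers: SEQ ID NO. {left.seq_id_no} and {right.seq_id_no}")
    print(f"interval length: {interval_cm:.1f} cM")
```

The flanking markers recovered this way are SEQ ID NO. 2 and SEQ ID NO. 8, and their distance of about 15.8 cM matches the interval length stated above.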

The mapping positions were confirmed when using seed set on a secondary head in both locations and when using phenotypic data of F3 progeny of this population the next year in two locations.

Example 3: Fine-Mapping of Rf Region in 1B

For further fine-mapping, 40 F2 individuals that were heterozygous in the QTL region were selected based on phenotype and genotype. A total of 2560 individual F3 plants were grown in the field at 2 locations. For each plant, seed set on the main head under a bag was measured. Additional SNP assays were developed to increase the marker density in the QTL interval. A total of 374 additional SNP markers were used in mapping the 1B region. Table 2 provides exemplary SNP markers that were mapped in the region.

Marker-trait associations using genetic maps of chromosome 1B, established on F2 and F3 genotyping data, were determined using R-QTL. A total of 1094 individuals with genotype and phenotype data were processed per location. The Rf locus could be further delimited to a region of about 1.25 cM on 1B (from 6.8 to 8.05 cM along chromosome 1B).

TABLE 2. Exemplary markers in the fine-mapped region on 1B. Significant markers (highlighted with x) are examples of markers that are in the QTL support interval (LOD threshold > 3; drop of 2 LODs from highest marker). The marker closest to the peak is marked with (v). Other markers residing outside the significant interval are indicated by ‘left flanking region’ (above) and ‘right flanking region’ (below).

SEQ ID  Rf donor  SNP       Fine-map        Significant            Peak
NO      allele    position  position (cM)   marker interval        marker
10      T         51        6.1             left flanking region
11      C         51        7.1             x
12      A         51        7.3             x
13      T         51        7.35            x                      v
14      T         108       8.05            x
5       T         51        8.1             right flanking region

Example 4: Integration of the Fine Map with Partial Genome Sequence and Candidate Gene Identification

The sequences of the fine-mapped markers were used for BLAST searches against contigs and scaffolds of the genome sequence of Chinese Spring. Stringent BLAST and parsing criteria were applied to position the SNPs in the partial genome sequence, such as >98% sequence identity, alignment length of >158 bp, hit in 1B sequence, and additional criteria for non-aligning overhang. Scaffolds were ordered according to the fine map (and additional genetic maps). Next, the Rf clade of pentatricopeptide repeat protein sequences from maize, Sorghum, rice and Brachypodium was collected, using the gene family analysis of Fujii et al. (PNAS, 2011, supra, see Table S1). A total of 43 protein sequences were used as BLAST queries, which identified one locus in the fine-mapped interval.
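The hit-parsing criteria above can be sketched as a simple filter. This is an illustration only: the `Hit` record, its field names and the example values are hypothetical, the chromosome assignment of each scaffold is assumed to be known in advance, and the unspecified criteria for non-aligning overhang are left out.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    marker: str       # name of the query SNP marker (hypothetical field)
    scaffold: str     # subject contig or scaffold
    chromosome: str   # chromosome assignment of the scaffold (assumed known)
    identity: float   # percent sequence identity of the alignment
    length: int       # alignment length in bp


def passes(hit, min_identity=98.0, min_length=158, chromosome="1B"):
    """Apply the stated thresholds: >98% identity, >158 bp, hit in 1B sequence."""
    return (
        hit.identity > min_identity
        and hit.length > min_length
        and hit.chromosome == chromosome
    )


def anchor_markers(hits):
    """Map each marker to the scaffold of its best passing hit.

    Markers without any passing hit are dropped. "Best" is taken here as
    highest identity, then longest alignment (an assumption).
    """
    best = {}
    for hit in filter(passes, hits):
        current = best.get(hit.marker)
        if current is None or (hit.identity, hit.length) > (current.identity, current.length):
            best[hit.marker] = hit
    return {marker: hit.scaffold for marker, hit in best.items()}


# Made-up hits, for illustration only.
example_hits = [
    Hit("marker_a", "scaffold_1", "1B", 99.5, 201),
    Hit("marker_a", "scaffold_2", "1D", 100.0, 201),  # wrong chromosome
    Hit("marker_b", "scaffold_3", "1B", 97.0, 201),   # identity too low
    Hit("marker_c", "scaffold_4", "1B", 99.0, 120),   # alignment too short
]
print(anchor_markers(example_hits))
```

Keeping only one scaffold per marker makes the subsequent ordering of scaffolds along the fine map unambiguous; markers with competing hits could equally be flagged for manual review.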

The scaffold containing the PPR gene is given as SEQ ID NO. 15.

The thus identified PPR gene is represented by SEQ ID NO. 16 (nt, coding sequence) and SEQ ID NO. 17 (aa).

Manual annotation resulted in the mRNA sequence of SEQ ID NO. 18, with the coding sequence of SEQ ID NO. 19 encoding the amino acid sequence of SEQ ID NO. 20.

Example 5: Further Fine-Mapping of Rf Region in 1B (F4) and In Silico Analysis

A set of SNP markers that were used for fine-mapping of the Rf3 locus was aligned to appropriate reference genome(s) to define a physical region representing the Rf3 QTL region on the reference genome. This QTL region was used to identify potential candidate genes and to develop additional markers for BAC-library screening (see below). Structural annotation of the Rf3 QTL region was carried out using ab initio gene annotation programs and an in-house annotation pipeline, as well as by alignment of wheat EST sequences, wheat FL-cDNA sequences, wheat gene models and known restorer genes from orthologous species available from public databases. Functional annotation of genes in the QTL region was carried out using the Blast2GO and PLAZA software programs as well as consultation of published literature. These candidate genes were then prioritized on the basis of their predicted functionality and their homology to known Rf genes (Chen and Liu, 2014; Dahan and Mireau, 2013).

Mapping the fine-mapping genetic markers to the ‘Chinese Spring’ reference genome defined a region of ~1.3 Mb on chromosome 1B that represented the Rf3 QTL region. In the ‘Chinese Spring’ reference, this region contained the identified pentatricopeptide repeat (PPR) gene. PPR proteins are a large family of proteins that are characterized by possession of the canonical, degenerate 35-amino acid repeat motifs and that have been identified in other crops as being involved in restoration of fertility. This is mainly through mechanisms involving modification of the processing and/or transcription of cytotoxic mitochondrial transcripts (Dahan and Mireau, 2013; Gaborieau et al., 2016; Chen and Liu, 2014; Schmitzlinneweber and Small, 2008). Restoration of fertility-type PPRs (Rf-PPRs) are members of the P-class of PPR proteins that typically bind single-stranded RNA in a sequence-specific fashion (Barkan et al., 2012; Binder et al., 2013; Chen and Liu, 2014; Gaborieau et al., 2016; Schmitzlinneweber and Small, 2008). Comparison of the PPR gene sequences present in the Rf3 QTL region showed that they clustered with known P-class Rf-PPR orthologues from other crop species (data not shown).

Example 6—BAC Libraries of Restorer Line

In parallel with the in silico analysis (see above), a BAC library was constructed for the above described wheat restorer line (hereafter referred to as ‘Resource-5’), by digesting high-molecular weight ‘Resource-5’ gDNA with a restriction enzyme and transforming the resultant fragments (mean insert size ~80-130 Kb) into E. coli. The fine-mapping SNP marker sequences, or markers developed from the Rf3 QTL region on the reference genome, were then used to design PCR primers to screen the pooled BAC clones. Once PCR-positive BAC pools had been identified, BACs from the pool were individualized and screened again with the same marker. Individual, PCR-positive BACs were then subjected to BAC-end sequencing to confirm integrity and the presence of the screening marker sequences. Finally, verified positive BACs were deep-sequenced using PacBio technology and the reads were assembled to generate a consensus sequence for the BAC insert. Sequenced, positive BACs were then aligned by de novo assembly or by assembly to the reference genome, or tiled using the screening markers, to generate a new ‘Resource-5’ reference sequence for the Rf3 QTL region. The ‘Resource-5’ Rf3 QTL reference sequence was then structurally and functionally annotated to identify any structural changes and/or differences in gene content and/or polymorphisms in the candidate gene captured within the region relative to the (non-restorer) reference genome.

The ‘Resource-5’ BAC library was screened multiple times using PCR markers developed from fine-mapping markers, reference genomes or isolated BAC sequences. Fourteen individualized and sequenced BACs were then tiled to create a contiguous sequence of ~650 Kb and one additional sequence of 121 Kb, separated by a gap of ~75 Kb relative to the ‘Chinese Spring’ reference genome. These contigs represent the unique ‘Resource-5’ genome sequence for the Rf3 QTL region and were found to capture the Rf-PPR candidate gene initially identified.

As shown in FIG. 3 A, the gene structure for Rf3-PPR is relatively simple, consisting of a single exon with no introns. This relatively simple gene structure appears to be typical for Rf-PPRs.

Comparison of the ‘Resource-5’ Rf3-PPR candidate gene to the ‘Chinese Spring’ orthologue indicated that the sequence is highly conserved and that there are no SNPs present either in the CDS or within 3 Kb upstream and downstream of the CDS. This suggests that the restorer phenotype is not linked to structural differences in the Rf-PPR protein.

SEQ ID NO. 21 represents the genomic DNA sequence of the Rf3-PPR gene.

Example 7—Annotation of the PPR Amino Acid Sequence

Known Rf-PPRs are members of the P-class of PPR proteins, and contain up to ~30 PPR motifs per protein, with each motif comprising 35 amino acids (Gaborieau et al., 2016). Structurally, PPR proteins consist of 2 α-helices that form a hairpin and a super-groove, and it is this super-groove that interacts with an RNA molecule. The amino acid composition of the individual PPR motifs determines which RNA nucleotide is recognized, and the number of PPR motifs determines the length of the RNA sequence recognized on the target transcript. Here the Rf3-PPR candidate was annotated to identify PPR motifs and other sequence features, and the results are summarized in FIGS. 3 B and C.

Rf3-PPR consists of 790 amino acids and contains 18 consecutive 35 amino-acid PPR motifs and a predicted transit peptide that targets the protein to the mitochondria (SEQ ID NO. 20). This is very similar to the structure of the Rf-1A gene cloned from rice, which is 791 amino acids long and contains 16 PPR repeats (Akagi et al., 2004; Komori et al., 2004).

Each PPR motif consists of 2 antiparallel helices that form a hairpin structure that interacts with a single-stranded RNA molecule. Studies have demonstrated the existence of a recognition code linking the identity of specific amino acids within the repeats and the target RNA sequence of the PPR protein studied (Barkan et al., 2012; Yagi et al., 2013). In particular, the identities of the 5th and the 35th amino acids of each motif have been shown to be important. In the context of CMS, this specificity is essential for targeting the CMS-conferring transcript. On the basis of the identity of the amino acids at positions 5 and 35 in each Rf-PPR motif, the predicted target transcript sequence for Rf3-PPR can be determined. Following the PPR code (Melonek et al., 2016, supra), the predicted RNA target sequence is thus 5′-ACCUGUNCGUAYNYGCAU-3′ (SEQ ID NO. 22, see also Table 3 below).

As shown in FIG. 4, alignment of the predicted target sequence of Rf3-PPR to the chimeric mitochondrial ORF256 transcript (SEQ ID NO. 23), which has been proposed to be responsible for the CMS phenotype (Hedgcoth et al., 2002), indicates that there is a potential interaction site at positions 129-146 (sequence ACTGCTTTCTATTTGCAC).

The results here indicate that Rf3-PPR potentially binds the chimeric ORF256 transcript responsible for the CMS phenotype, where it is thought to act by reducing the steady-state level of the deleterious ORF256 either by decreasing the stability of the corresponding RNA or by reducing translation (Binder et al., 2013).
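How such a comparison between a degenerate predicted target and a transcript can be scored is sketched below. The scoring scheme (counting positions where the transcript base is allowed by the IUPAC code of the prediction) and the function names are assumptions for illustration; the alignment method behind FIG. 4 is not specified here. Only the two sequences quoted above are used.

```python
# IUPAC RNA codes occurring in the predicted target sequence.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "Y": "CU", "N": "ACGU"}

PREDICTED_TARGET = "ACCUGUNCGUAYNYGCAU"   # SEQ ID NO. 22
ORF256_SITE_DNA = "ACTGCTTTCTATTTGCAC"    # positions 129-146 as quoted above


def compatible_positions(pattern, site_dna):
    """Return one boolean per position: is the site base allowed by the pattern?"""
    site_rna = site_dna.upper().replace("T", "U")
    if len(pattern) != len(site_rna):
        raise ValueError("pattern and site must have the same length")
    return [base in IUPAC[code] for code, base in zip(pattern, site_rna)]


def best_window(pattern, transcript_dna):
    """Slide the pattern along a transcript; return (1-based start, matches).

    Ties are resolved in favour of the first window encountered.
    """
    width = len(pattern)
    best = (0, -1)
    for start in range(len(transcript_dna) - width + 1):
        window = transcript_dna[start:start + width]
        matches = sum(compatible_positions(pattern, window))
        if matches > best[1]:
            best = (start + 1, matches)
    return best


flags = compatible_positions(PREDICTED_TARGET, ORF256_SITE_DNA)
print(sum(flags), "of", len(flags), "positions compatible")  # 12 of 18
```

With the sequences as quoted, 12 of the 18 positions are compatible, two of them only because the code leaves the base undetermined (N). A scan over the full transcript with `best_window` would require SEQ ID NO. 23, which is not reproduced in this section.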

TABLE 3. PPR motifs and base recognition. See also FIG. 3.

PPR    Aa positions      Positions  Base
motif  (SEQ ID NO. 20)   5 and 35   recognition
1      121-155           GN         A
2      156-191           NN         C
3      192-226           NN         C
4      227-261           ND         U
5      262-296           SD         G
6      297-331           ND         U
7      332-366           AN         ?
8      367-401           NN         C
9      402-436           SD         G
10     437-472           RD         U
11     473-507           SN         A
12     508-542           NC         C/U
13     543-577           GT         ?
14     578-612           NC         C/U
15     613-647           SD         G
16     648-682           NN         C
17     683-717           TN         A
18     718-752           NE         U
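The derivation of the predicted target from Table 3 can be reproduced with a short lookup. The dictionary below contains only the amino acid pairs and bases listed in Table 3 (C/U written as the IUPAC code Y); it is not a complete PPR code, and pairs without an assigned base are written as N. The names are illustrative.

```python
# Position-5/position-35 amino acid pairs and the base recognised, per Table 3.
PPR_CODE = {
    "GN": "A", "SN": "A", "TN": "A",
    "NN": "C",
    "SD": "G",
    "ND": "U", "RD": "U", "NE": "U",
    "NC": "Y",   # C or U
}

# Amino acid pairs of the 18 consecutive motifs of Rf3-PPR, in order (Table 3).
RF3_PPR_MOTIFS = [
    "GN", "NN", "NN", "ND", "SD", "ND", "AN", "NN", "SD",
    "RD", "SN", "NC", "GT", "NC", "SD", "NN", "TN", "NE",
]


def predict_target(motif_pairs, code=PPR_CODE):
    """Translate motif pairs into an RNA target, 5' to 3'; unknown pairs give N."""
    return "".join(code.get(pair, "N") for pair in motif_pairs)


print(predict_target(RF3_PPR_MOTIFS))  # ACCUGUNCGUAYNYGCAU
```

The output equals the sequence given as SEQ ID NO. 22, with the two undetermined motifs (AN and GT) appearing as N at positions 7 and 13.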

Example 8—Expression Analysis

mRNA

Total RNA was isolated from ~70-100 mg fresh weight of tissue using the Sigma Spectrum Plant Total RNA Kit (Sigma-Aldrich), and any gDNA contamination was removed using the Qiagen RNase-Free DNase Set (Cat. No. 79254). RNA concentration and integrity were determined with an Agilent Expert BioAnalyser. Tissue was sampled at four developmental stages (young leaf, spike 2.5-3.5, spike 3.5-4.5, spike 4.5-5.5 cm and anthers), using individuals from an F4 population of progeny derived from ‘Resource-5’. These progeny were genotyped using fine-mapping markers, phenotyped for fertility traits, and classified as either non-restoring (−/−) or heterozygous for Rf3 (Rf3/−). Three individual biological replicates were prepared per tissue type per genotype.

qRT-PCR Analyses

mRNA from each of the tissue/Rf3 genotype combinations was converted into cDNA using the EcoMix dry kit from Clontech. Gene-specific probes were designed to quantify gene expression levels using the TaqMan assay, as summarized in Table 4. Probe specificity and efficiency were tested and optimised, and expression analyses were carried out on cDNA samples generated as above.

TABLE 4. TaqMan primer and probe sequences used for gene expression analyses.

Gene i.d.  Name  Type    Target region           Sequence 5′ --> 3′         SEQ ID NO.
Rf3-PPR    Fw2   Primer  1648..1671              TGATGGTGTTGGACCTGATAATGT   24
           Rev2  Primer  complement(1696..1717)  CCAGTGGCCTGAAGAGGAATAT     25
           P2    Probe   1673..1693              ACGTATAGTAGCCTCATCCAT      26
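As a small consistency check on Table 4, the sketch below verifies that each oligonucleotide length agrees with its target coordinates, that the probe lies between the two primers, and derives the amplicon length. The coordinates are taken to be 1-based and inclusive, which is an assumption about the notation; the helper names are illustrative.

```python
# name: (sequence 5'->3', start, end); coordinates assumed 1-based, inclusive.
OLIGOS = {
    "Fw2": ("TGATGGTGTTGGACCTGATAATGT", 1648, 1671),
    "Rev2": ("CCAGTGGCCTGAAGAGGAATAT", 1696, 1717),  # on the complementary strand
    "P2": ("ACGTATAGTAGCCTCATCCAT", 1673, 1693),
}


def span(start, end):
    """Number of bases covered by an inclusive coordinate range."""
    return end - start + 1


def lengths_consistent(oligos):
    """For each oligo, does its sequence length equal its coordinate span?"""
    return {name: len(seq) == span(start, end) for name, (seq, start, end) in oligos.items()}


def amplicon_length(forward, reverse):
    """Amplicon runs from the first base of the forward primer to the last base of the reverse primer site."""
    return span(forward[1], reverse[2])


def probe_between_primers(forward, probe, reverse):
    """The probe must not overlap either primer binding site."""
    return forward[2] < probe[1] and probe[2] < reverse[1]


print(lengths_consistent(OLIGOS))
print(amplicon_length(OLIGOS["Fw2"], OLIGOS["Rev2"]))
print(probe_between_primers(OLIGOS["Fw2"], OLIGOS["P2"], OLIGOS["Rev2"]))
```

Under the stated coordinate convention the three oligonucleotides are consistent with their target regions and define an amplicon of 70 bp.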

Gene expression was examined in individual plants selected from F4 fine-mapping progeny segregating for the Rf3 locus, in four different tissues: young leaf, developing spike 2.5-3.5 cm, developing spike 3.5-4.5 cm, developing spike 4.5-5.5 cm and anthers. Since it is expected that the cytoplasmic male sterile phenotype is due to the production of non-viable pollen, Rf genes must at least be expressed during the period of pollen development and meiosis. It is also expected that Rf gene expression will be highest in the early stages of pollen development.

As shown in FIG. 5, mean expression of the PPR gene is exclusively associated with the presence of the Rf3 locus and is highest at the 3.5-4.5 cm stage of spike development.

The Resource-5 Rf3-PPR candidate does not possess any SNPs or polymorphisms relative to the non-restorer Chinese Spring reference within a 12 Kb region centered around the CDS. However, it is situated at or near the QTL peak for the restoration phenotype, and its expression is exactly correlated with the presence of an active Rf3 locus. The Rf3-PPR does, however, have multiple predicted miRNA binding sites in the region 160-270 bp 5′ to the ATG start, and it is well documented that PPRs in particular are subject to regulation by sRNAs/miRNAs (Xia et al., 2013). For example, in rice, expression of a lncRNA that produces a 21 nt sRNA is required for pollen development under long-day conditions, and a single SNP that alters the secondary structure of the lncRNA leads to increased methylation of the promoter region of this lncRNA, reducing transcription and resulting in premature programmed cell death in developing anthers (Ding et al., 2012). Similarly, Ding et al. demonstrated that a single polymorphism in the rice sRNA osa-smR5864m is a common cause of pollen sterility in japonica and indica lines (Ding et al., 2012). Wei et al. also identified miRNAs responsible for pollen abortion in Chinese cabbage (Wei et al., 2015), and altered miRNA expression has been associated with male sterility in pummelo (Fang et al., 2016), asparagus (Chen et al., 2016) and cotton (Wei et al., 2013). Therefore, expression could be driven by a trans-acting miRNA/sRNA impacting transcript stability or transcription.

Example 9: Candidate Gene Validation

By Mutagenesis

A mutagenized population of the restorer line is constructed. Based on sequencing, mutant plants with an inactivating mutation in the Rf candidate PPR gene are identified. The homozygous mutant plants and their wildtype segregants are screened for fertility restoration capacity. The plants that have a mutated PPR gene no longer have restoring ability, confirming that the identified candidate PPR gene is a functional Rf gene.

By Overexpression

The coding sequence of the candidate PPR-Rf gene is cloned under the control of a constitutive UBIQUITIN promoter (e.g. pUbiZm from maize), or under the control of a constitutive cauliflower mosaic virus promoter (p35S), or under the control of a vernalisation-related barley promoter (pvrn1) (or under control of its native promoter), in a T-DNA expression vector comprising a selectable marker, such as the bar gene. The resulting vector is transformed into a wheat line having no restoration capacity, such as the transformable variety Fielder (or Chinese Spring), according to methods well known in the art for wheat transformation (see e.g. Ishida et al., Methods Mol Biol. 2015; 1223:189-98). The copy number of the transgene in the transgenic plant is determined by real-time PCR on the selectable marker gene. The transformed plants comprising the candidate PPR-Rf gene cassette, preferably in single copy, are transferred to the greenhouse. Expression of the transgene in leaf tissue and in young developing spikes is tested by qRT-PCR. Transgenic T0 plants expressing the candidate PPR-Rf gene are crossed as male parents to a G-type cytoplasmic male sterile (“CMS”) wheat line. F1 progeny of the crosses contain the G-type cytoplasm and show partial or complete restoration of male fertility due to the presence of the candidate PPR-Rf gene.

The level of restoration in F1 progeny is tested using four different assays. In the first assay, the mitochondrial ORF256 protein is quantified on Western blot using polyclonal antibodies raised against synthetic ORF256 protein. Expression of a functional candidate PPR-Rf gene leads to reduced accumulation of the ORF256 protein. In the second assay, pollen accumulation and pollen viability are quantified using the AmphaZ30 device. Expression of a functional candidate PPR-Rf gene leads to higher numbers of viable pollen. In the third assay, the integrity of anther tissues is inspected microscopically. Expression of a functional candidate PPR-Rf gene leads to better preservation of a functional tapetum layer. In the fourth assay, seed set per ear from self-pollination is quantified. Expression of a functional candidate PPR-Rf gene leads to a higher number of grains per ear. In all tests, the F1 progeny from crosses of non-transgenic Fielder plants to the same G-type cytoplasmic male sterile (“CMS”) wheat line serve as a control.

By Targeted Knock-Out

Guide RNAs for CRISPR-mediated gene editing targeting the mRNA coding sequence, preferably the protein coding sequence of the candidate PPR-Rf gene, or the immediately upstream promoter sequence of the candidate PPR-Rf gene, are designed by using e.g. the CAS-finder tool. Preferably, four unique or near-unique guide RNAs are designed per target gene. The guide RNAs are tested for targeting efficiency by PEG-mediated transient co-delivery of the gRNA expression vector with an expression vector for the respective nuclease, e.g. Cas9 or Cpf1, under control of appropriate promoters, to protoplasts of a wheat restorer line containing the candidate PPR-Rf gene of interest, preferably the line designated as T. timopheevii/2*Iowin//2*Quivira, USDA Accession number PI 583676. Genomic DNA is extracted from the protoplasts after delivery of the guide RNA and nuclease vectors. After PCR amplification, integrity of the targeted candidate PPR-Rf gene sequence is assessed by sequencing.
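The first step of guide RNA design, enumerating candidate target sites, can be sketched as follows. The sketch assumes a Cas9-type nuclease recognising an NGG PAM immediately 3′ of a 20-nt protospacer; Cpf1 uses a different PAM and would need a different pattern. It does not reproduce the CAS-finder tool and performs no genome-wide uniqueness or off-target check, which would be needed to select unique or near-unique guides. All function names are illustrative.

```python
import re


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_protospacers(seq, length=20):
    """List candidate sites as (strand, start, protospacer, PAM).

    `start` is the 0-based forward-strand coordinate of the leftmost base of
    the whole site (protospacer plus PAM). Overlapping sites are reported.
    """
    seq = seq.upper()
    n = len(seq)
    # Zero-width lookahead so that overlapping candidates are all found.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            start = match.start()
            if strand == "-":
                # Convert the reverse-strand offset to a forward-strand coordinate.
                start = n - (match.start() + length + 3)
            sites.append((strand, start, match.group(1), match.group(2)))
    return sites


# Toy input, not a real gene sequence.
for site in find_protospacers("A" * 20 + "TGG"):
    print(site)
```

In practice the candidate list would be generated from the coding sequence or promoter of the target gene and then filtered for uniqueness against the wheat genome before the transient protoplast test described above.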

The one or two most efficient guide RNAs are used for stable gene editing in the same wheat restorer line also containing the G-type CMS cytoplasm. For this purpose, the selected guide RNA expression vector, together with a nuclease expression module and a selectable marker gene, is introduced into embryos isolated from the aforementioned wheat restorer line using e.g. particle gun bombardment. Transgenic plants showing resistance to the selection agent are regenerated using methods known to those skilled in the art. Transgenic T0 plants containing gene targeting events, preferably small deletions likely resulting in a non-functional target candidate Rf-PPR gene, are identified by PCR amplification and sequencing.

Transgenic T0 plants containing the G-type CMS cytoplasm and likely to contain a functional knock-out of the candidate PPR-Rf gene, preferably in homozygous state, but alternatively in heterozygous state, are crossed as female parents to a spring wheat line with normal cytoplasm and without PPR-Rf genes. The F1 progeny of the crosses contain the G-type “CMS” cytoplasm, and 50% (in case of heterozygous T0) or 100% (in case of homozygous T0) of the F1 progeny will lack a functional version of the target Rf-PPR gene. The F1 plants lacking a functional target Rf-PPR gene are identified using genomic PCR assays. The F1 plants show partial or complete loss of male fertility due to the knock-out of the candidate PPR-Rf gene.

The level of male fertility in the F1 progeny lacking a functional version of the candidate Rf-PPR gene is tested using four different assays. In the first assay, the mitochondrial ORF256 protein is quantified on Western blot using polyclonal antibodies raised against synthetic ORF256 protein. The knock-out of a functional candidate PPR-Rf gene leads to increased accumulation of the ORF256 protein. In the second assay, pollen accumulation and pollen viability are quantified using the AmphaZ30 device. The knock-out of a functional candidate PPR-Rf gene leads to lower numbers of viable pollen. In the third assay, the integrity of anther tissues is inspected microscopically. The knock-out of a functional candidate PPR-Rf gene leads to early deterioration of the tapetum layer. In the fourth assay, seed set per ear from self-pollination is quantified. The knock-out of a functional candidate PPR-Rf gene leads to a reduced number of grains per ear. In all tests, the F1 progeny from crosses of non-edited Rf plants to the same spring wheat line serve as a control.
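The 50% and 100% figures follow from simple Mendelian segregation, which the sketch below enumerates. It assumes a single locus, equal transmission of both T0 alleles and a pollen parent that carries no functional allele; the allele labels and function name are illustrative.

```python
from fractions import Fraction
from itertools import product


def f1_fraction_lacking_functional(t0_genotype, pollen_parent=("-", "-")):
    """Fraction of F1 progeny without any functional allele.

    Alleles: "Rf" = functional, "ko" = knock-out, "-" = gene absent.
    Each parent contributes one of its two alleles with equal probability.
    """
    crosses = list(product(t0_genotype, pollen_parent))
    lacking = sum(1 for pair in crosses if "Rf" not in pair)
    return Fraction(lacking, len(crosses))


print(f1_fraction_lacking_functional(("Rf", "ko")))  # heterozygous T0 -> 1/2
print(f1_fraction_lacking_functional(("ko", "ko")))  # homozygous T0   -> 1
```

Because the edited T0 plant is the female parent, all F1 progeny inherit the G-type cytoplasm regardless of which nuclear allele they receive.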

REFERENCES

-   Akagi, H., Nakamura, A., Yokozeki-Misono, Y., Inagaki, A., Takahashi, H., Mori, K., and Fujimura, T. (2004). Positional cloning of the rice Rf-1 gene, a restorer of BT-type cytoplasmic male sterility that encodes a mitochondria-targeting PPR protein. Theor. Appl. Genet. 108, 1449-1457.
-   Barkan, A., Rojas, M., Fujii, S., Yap, A., Chong, Y. S., Bond, C. S., and Small, I. (2012). A Combinatorial Amino Acid Code for RNA Recognition by Pentatricopeptide Repeat Proteins. PLoS Genet. 8, e1002910.
-   Binder, S., Stoll, K., and Stoll, B. (2013). P-class pentatricopeptide repeat proteins are required for efficient 5′ end formation of plant mitochondrial transcripts. RNA Biol. 10, 1511-1519.
-   Chen, L., and Liu, Y.-G. (2014). Male Sterility and Fertility Restoration in Crops. Annu. Rev. Plant Biol. 65, 579-606.
-   Chen, J., Zheng, Y., Qin, L., Wang, Y., Chen, L., He, Y., Fei, Z., and Lu, G. (2016). Identification of miRNAs and their targets through high-throughput sequencing and degradome analysis in male and female Asparagus officinalis. BMC Plant Biol. 16, 80.
-   Dahan, J., and Mireau, H. (2013). The Rf and Rf-like PPR in higher plants, a fast-evolving subclass of PPR genes. RNA Biol. 10, 1469-1476.
-   Ding, J., Lu, Q., Ouyang, Y., Mao, H., Zhang, P., Yao, J., Xu, C., Li, X., Xiao, J., and Zhang, Q. (2012). A long noncoding RNA regulates photoperiod-sensitive male sterility, an essential component of hybrid rice. Proc. Natl. Acad. Sci. 109, 2654-2659.
-   Fang, Y.-N., Zheng, B.-B., Wang, L., Yang, W., Wu, X.-M., Xu, Q., and Guo, W.-W. (2016). High-throughput sequencing and degradome analysis reveal altered expression of miRNAs and their targets in a male-sterile cybrid pummelo (Citrus grandis). BMC Genomics 17, 591.
-   Gaborieau, L., Brown, G. G., and Mireau, H. (2016). The Propensity of Pentatricopeptide Repeat Genes to Evolve into Restorers of Cytoplasmic Male Sterility. Front. Plant Sci. 7.
-   Hedgcoth, C., El-Shehawi, A. M., Wei, P., Clarkson, M., and Tamalis, D. (2002). A chimeric open reading frame associated with cytoplasmic male sterility in alloplasmic wheat with Triticum timopheevi mitochondria is present in several Triticum and Aegilops species, barley, and rye. Curr. Genet. 41, 357-366.
-   Komori, T., Ohta, S., Murai, N., Takakura, Y., Kuraya, Y., Suzuki, S., Hiei, Y., Imaseki, H., and Nitta, N. (2004). Map-based cloning of a fertility restorer gene, Rf-1, in rice (Oryza sativa L.). Plant J. 37, 315-325.
-   Schmitzlinneweber, C., and Small, I. (2008). Pentatricopeptide repeat proteins: a socket set for organelle gene expression. Trends Plant Sci. 13, 663-670.
-   Wei, M., Wei, H., Wu, M., Song, M., Zhang, J., Yu, J., Fan, S., and Yu, S. (2013). Comparative expression profiling of miRNA during anther development in genetic male sterile and wild type cotton. BMC Plant Biol. 13, 66.
-   Wei, X., Zhang, X., Yao, Q., Yuan, Y., Li, X., Wei, F., Zhao, Y., Zhang, Q., Wang, Z., Jiang, W., et al. (2015). The miRNAs and their regulatory networks responsible for pollen abortion in Ogura-CMS Chinese cabbage revealed by high-throughput sequencing of miRNAs, degradomes, and transcriptomes. Front. Plant Sci. 6.
-   Xia, R., Meyers, B. C., Liu, Z., Beers, E. P., Ye, S., and Liu, Z. (2013). MicroRNA Superfamilies Descended from miR390 and Their Roles in Secondary Small Interfering RNA Biogenesis in Eudicots. Plant Cell Online 25, 1555-1572.
-   Yagi, Y., Hayashi, S., Kobayashi, K., Hirayama, T., and Nakamura, T. (2013). Elucidation of the RNA Recognition Code for Pentatricopeptide Repeat Proteins Involved in Organelle RNA Editing in Plants. PLoS ONE 8, e57286.

1. A nucleic acid molecule encoding a functional restorer gene allele for wheat G-type cytoplasmic male sterility, wherein said functional restorer gene allele localizes to the scaffold as represented by SEQ ID NO. 15.

2. The nucleic acid molecule of claim 1, wherein said functional restorer gene allele is a functional allele of a PPR gene localising to said scaffold.

3. The nucleic acid of claim 1 or 2, wherein said functional restorer gene allele is a functional allele of a PPR gene encoded by SEQ ID NO. 19, SEQ ID NO. 18 or SEQ ID NO. 16, SEQ ID NO. 21 or of the polypeptide of SEQ ID NO. 20 or SEQ ID NO. 17.

4. The nucleic acid of any one of claims 1-3, wherein said functional restorer gene is selected from a. a nucleic acid comprising a nucleotide sequence having at least 85% sequence identity to SEQ ID NO. 19, SEQ ID NO. 18, SEQ ID NO. 16, SEQ ID NO. 21; b. a nucleic acid encoding a polypeptide having at least 85% sequence identity to SEQ ID NO. 20 or SEQ ID NO. 17.

5. The nucleic acid of any one of claims 1-4, wherein said functional restorer gene allele encodes a PPR protein capable of binding to the mRNA of ORF256, preferably to nt 129-146 of SEQ ID NO. 23.

6. The nucleic acid of any one of claims 1-5, wherein said functional restorer gene allele is obtainable from USDA accession number PI 583676.

7. The nucleic acid of any one of claims 1-6, wherein said functional restorer gene allele comprises the nucleotide sequence of SEQ ID NO. 19, SEQ ID NO. 18, SEQ ID NO. 16, SEQ ID NO. 21 or wherein said functional restorer gene allele encodes the polypeptide of SEQ ID NO. 20 or SEQ ID NO. 17.

8. A polypeptide encoded by the nucleic acid molecule of any one of claims 1-7.
9. A chimeric gene comprising the following operably linked elements a. a plant-expressible promoter; b. a nucleic acid comprising the nucleic acid molecule of any one of claims 1-7 or encoding the polypeptide of claim 8; and optionally c. a transcription termination and polyadenylation region functional in plant cells, wherein at least one of said operably linked elements is heterologous with respect to at least one other element.

10. The chimeric gene of claim 9, wherein said promoter is capable of directing expression of the operably linked nucleic acid at least during (early) pollen development and meiosis, such as in anther or, more specifically, tapetum, or developing microspores.

11. A cereal plant cell or cereal plant or seed thereof, such as a wheat plant cell or plant or seed thereof, comprising the nucleic acid molecule of any one of claims 1-7, polypeptide of claim 8, or the chimeric gene of claim 9 or 10, wherein said polypeptide, said nucleic acid, or said chimeric gene in each case is heterologous with respect to said plant cell or plant or seed.

12. The plant cell, plant or seed of claim 11, wherein the polypeptide of claim 8 is expressed at least during (early) pollen development and meiosis, such as in anther or, more specifically, tapetum, or developing microspore.

13. The plant cell, plant or seed of claim 11 or 12, which is a hybrid plant cell, plant or seed.

14. A method for producing a cereal plant cell or plant or seed thereof, such as a wheat plant cell or plant or seed thereof, comprising a functional restorer gene for wheat G-type cytoplasmic male sterility, or for increasing restoration capacity for wheat G-type cytoplasmic male sterility (“CMS”) in a cereal plant, such as a wheat plant, comprising the steps of providing said plant cell or plant with the nucleic acid molecule of any one of claims 1-7 or the chimeric gene of claim 9 or 10, wherein said providing comprises transformation, crossing, backcrossing, genome editing or mutagenesis.

15. A method for producing a cereal plant cell or plant or seed thereof, such as a wheat plant cell or plant or seed thereof, comprising a functional restorer gene for wheat G-type cytoplasmic male sterility, or for increasing restoration capacity for wheat G-type cytoplasmic male sterility (“CMS”) in a cereal plant, such as a wheat plant, comprising the steps of increasing the expression of a polypeptide according to claim 8 in said plant cell or plant or seed.

16. A method for converting a non-restoring cereal plant, such as a wheat plant, into a restoring plant for wheat G-type cytoplasmic male sterility (“CMS”), or for increasing restoration capacity for wheat G-type cytoplasmic male sterility (“CMS”) in a cereal plant, such as a wheat plant, comprising the steps of modifying the genome of said plant to comprise the nucleic acid molecule of any one of claims 1-7 or the chimeric gene of claim 9 or 10, wherein said modifying comprises transformation, crossing, backcrossing, genome editing or mutagenesis.

17. A method for converting a non-restoring cereal plant, such as a wheat plant, into a restoring plant for wheat G-type cytoplasmic male sterility (“CMS”), or for increasing restoration capacity for wheat G-type cytoplasmic male sterility (“CMS”) in a cereal plant, such as a wheat plant, comprising the steps of modifying the genome of said plant to increase the expression of a polypeptide according to claim 8 in said plant.

18. A cereal plant cell or cereal plant or seed thereof, such as a wheat plant cell or plant or seed thereof, obtained according to the method of any one of claims 14-17, preferably wherein said plant has an increased restoration capacity for wheat G-type cytoplasmic male sterility (“CMS”).
19. The plant cell, plant or seed of claim 18, which is a hybrid plant cell, plant or seed.

20. A method for identifying and/or selecting a cereal (e.g. wheat) plant comprising a functional restorer gene allele for wheat G-type cytoplasmic male sterility comprising the steps of a. identifying or detecting in said plant the presence of a nucleic acid of any one of claims 1-7 or of the polypeptide according to claim 8, or the chimeric gene of claim 9 or 10; b. and optionally selecting said plant comprising said nucleic acid or polypeptide or chimeric gene.

21. The method of claim 20, wherein said polypeptide is expressed at least during (early) pollen development and meiosis, such as in anther or, more specifically, tapetum, or developing microspore.

22. A method for producing a cereal plant, such as a wheat plant, comprising a functional restorer gene allele for wheat G-type cytoplasmic male sterility, comprising the steps of a. crossing a first cereal plant, such as a wheat plant, comprising a functional restorer gene for wheat G-type cytoplasmic male sterility of any one of claims 11, 12 or 18 with a second cereal plant; b. identifying a progeny plant comprising a functional restorer gene allele for wheat G-type cytoplasmic male sterility according to the method of claim 20 or 21.

23. A method for producing a cereal plant, such as a wheat plant, comprising a functional restorer gene allele for wheat G-type cytoplasmic male sterility, comprising the steps of a. crossing a first cereal plant, such as a wheat plant, homozygous for a functional restorer gene for wheat G-type cytoplasmic male sterility of any one of claims 11, 12 or 18 with a second cereal plant; b. obtaining a progeny plant, wherein said progeny plant comprises said functional restorer gene allele for wheat G-type cytoplasmic male sterility.

24. A method for producing hybrid seed, comprising the steps of: a. providing a male cereal parent plant, such as a wheat plant, according to claim 11, 12 or 18, said plant comprising said functional restorer gene allele for wheat G-type cytoplasmic male sterility, wherein said functional restorer gene allele is preferably present in homozygous form; b. providing a female cereal parent plant that is a G-type cytoplasmic male sterile cereal plant; c. crossing said female cereal parent plant with said male cereal parent plant; and optionally d. harvesting seeds.

25. Use of the nucleic acid of any one of claims 1-7 to identify one or more further functional restorer gene alleles for wheat G-type cytoplasmic male sterility.

26. Use of the nucleic acid of any one of claims 1-7 or of the polypeptide according to claim 8 or of the chimeric gene of claim 9 or 10 for the identification of a plant comprising said functional restorer gene allele for wheat G-type cytoplasmic male sterility.

27. Use of a plant according to any one of claims 11, 12 or 18 or a plant obtained by the method of any one of claims 14-17 or 23-24, said plant comprising said functional restorer gene for wheat G-type cytoplasmic male sterility, for restoring fertility in a progeny of a G-type cytoplasmic male sterile cereal plant, such as a wheat plant.

28. Use of a plant according to any one of claims 11, 12 or 18 or a plant obtained by any one of claims 14-17 or 22-23, said plant comprising said functional restorer gene for wheat G-type cytoplasmic male sterility, for producing hybrid seed or a population of hybrid cereal plants, such as wheat seed or plants.